Abstract

We prove that under the Gaussian measure, half-spaces are uniquely 
the most noise stable sets. We also prove a quantitative version of 
uniqueness, showing that a set which is almost optimally noise stable
must be close to a half-space. This extends a theorem of Borell,
who proved the same result but without uniqueness, and it also an- 
swers a question of Ledoux, who asked whether it was possible to prove 
Borell's theorem using a direct semigroup argument. Our quantitative 
uniqueness result has various applications in diverse fields. 

1 Introduction 

Gaussian stability theory is a rich extension of Gaussian isoperimetric
theory. As such it connects numerous areas of mathematics including
probability, geometry [6], concentration and high dimensional phenomena [29],
rearrangement inequalities [7, 11, 16] and more. On the other hand, this theory
has recently found fascinating applications in combinatorics and theoretical
computer science. It was essential in [32] for proving the "majority is
stablest" conjecture [17, 23], the "it ain't over until it's over" conjecture [19],
and for establishing the unique games computational hardness [22] of
numerous optimization problems including, for example, constraint satisfaction
problems [2l fT2lf2¥ll35] . 

The standard measure of stability of a set is the probability that posi- 
tively correlated standard Gaussian vectors both lie in the set. The main 
result in this area, which is used in all of the applications mentioned above, is 
that half-spaces have optimal stability among all sets with a given Gaussian
measure. This fact was originally proved by Borell [6], in a difficult proof
using Ehrhard symmetrization. Recently, two different proofs of Borell's
result have emerged. First, Isaksson and the first author [16] applied some
recent advances in spherical symmetrization [7] to give a proof that also
generalizes to a problem involving more than two Gaussian vectors. Then 
Kindler and O'Donnell [25], using the sub-additivity idea of Kane [20], gave 
a short and elegant proof, but only for sets of measure 1/2 and for some 
special values of the correlation. 

In this paper, we will give a novel proof of Borell's result. In doing so, we 
answer a question posed 18 years ago by Ledoux [27], who used semigroup 
methods to show that Borell's inequality implies the Gaussian isoperimetric 
inequality and then asked whether similar methods could be used to give a 
short and direct proof of Borell's inequality. Moreover, our proof will allow
us to strengthen Borell's result and its discrete applications. First, we will
demonstrate that half-spaces are the unique optimizers of Gaussian stability 
(up to almost sure equality). Then we will quantify this statement, by 
showing that if the stability of a set is close to optimal given its measure, 
then the set must be close to a half-space. 

The questions of equality and robustness of isoperimetric inequalities 
can be rather more subtle than the inequalities themselves. In the case of 
the standard Gaussian isoperimetric result, it took about 25 years from the
time the inequality was established [5, 36] before the equality cases were
fully characterized [8] (although the equality cases among sufficiently nice
sets were known earlier [13]). Robust versions of the standard Gaussian
isoperimetric result were first established only recently [10, 31]. Here, for
the first time since Borell's original proof [6] more than 25 years ago, we 
establish both that half-spaces are the unique maximizers and that a robust 
version of this statement is also true. 

1.1 Discrete applications 

From our Gaussian results, we derive robust versions of some of the main
discrete applications of Borell's result, including a robust version of the
"majority is stablest" theorem [32]. The "majority is stablest" theorem
concerns subsets $A$ of the discrete cube $\{-1,1\}^n$ with the property that
each coordinate $x_i$ has only a small influence on whether $x \in A$ (see [32] for
a precise definition); the theorem says that over all such sets $A$, the ones
that are most noise stable take the form $\{x : \sum_i a_i x_i \le b\}$. From the results we
prove here, it is possible to obtain a robust version of this, which says that
any set $A \subset \{-1,1\}^n$ with small coordinate influences and almost optimal
noise stability must be close to some set of the form $\{x : \sum_i a_i x_i \le b\}$.

A robust form of the "majority is stablest" theorem immediately implies
a robust version of the quantitative Arrow theorem. In economics, Arrow's
theorem [1] says that any non-dictatorial election system between three
candidates which satisfies two natural properties (namely the "independence
of irrelevant alternatives" and "neutrality") has a chance of producing a
non-rational outcome. (By non-rational outcome, we mean that there are
three candidates, A, B and C say, such that candidate A is preferred to
candidate B, B is preferred to C and C is preferred to A.) Kalai [17, 18]
showed that if the election system is such that each voter has only a small
influence on the outcome, then the probability of a non-rational outcome
is substantial; moreover, the "majority is stablest" theorem [32] implies
that the probability of a non-rational outcome can be minimized by using
a simple majority vote to decide, for each pair of candidates, which one is
preferred. A robust version of the "majority is stablest" theorem implies
immediately that (weighted) majority-based voting methods are essentially the
only low-influence methods that minimize the probability of a non-rational
outcome.

In a different direction, our robust noise stability result has an
application in hardness of approximation, specifically in the analysis of the
well-known Max-Cut optimization problem. The Max-Cut problem seeks a
partition of a graph $G$ into two pieces such that the number of edges from one
piece to the other is maximal. This problem is NP-hard [21] but Goemans
and Williamson [15] gave an approximation algorithm with an approximation
ratio of 0.878. Their algorithm works by embedding the graph $G$ on
a high-dimensional sphere and then cutting it using a random hyperplane.
Feige and Schechtman [14] showed that a random hyperplane is the optimal
way to cut this embedded graph; with our robust noise stability theorem, we
can show that any almost-optimal cutting procedure is almost the same as
using a random hyperplane. The latter result is derived via a novel
isoperimetric result for spheres in high dimensions where two points are connected
if their inner product is exactly some prescribed number $\rho$.

1.2 Borell's theorem and a functional variant 

Let $\gamma_n$ be the standard Gaussian measure on $\mathbb{R}^n$. For $-1 < \rho < 1$ let $X$
and $Y$ be jointly Gaussian random vectors on $\mathbb{R}^n$, such that $X$ and $Y$ are
standard Gaussian vectors and $\mathbb{E} X_i Y_j = \delta_{ij}\rho$. We will write $\Pr_\rho$ for the joint
probability distribution of $X$ and $Y$. We will also write $\phi$ for the density of
$\gamma_1$ and $\Phi$ for its distribution function:

$$\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \phi(y)\, dy.$$

Theorem 1.1 (Borell [6]). For any $0 < \rho < 1$ and any measurable $A_1, A_2 \subset \mathbb{R}^n$,

$$\Pr_\rho(X \in A_1, Y \in A_2) \le \Pr_\rho(X \in B_1, Y \in B_2) \tag{1.1}$$

where

$$B_1 = \{x \in \mathbb{R}^n : x_1 \le \Phi^{-1}(\gamma_n(A_1))\}$$

and

$$B_2 = \{x \in \mathbb{R}^n : x_1 \le \Phi^{-1}(\gamma_n(A_2))\}$$

are parallel half-spaces with the same volumes as $A_1$ and $A_2$ respectively.
If $-1 < \rho < 0$ then the inequality (1.1) is reversed.

Like many other inequalities about sets, Theorem 1.1 has a functional
analogue. To state it, we define the function

$$J(x, y) = J(x, y; \rho) = \Pr_\rho\big(X_1 \le \Phi^{-1}(x),\ Y_1 \le \Phi^{-1}(y)\big).$$

Theorem 1.2. For any measurable functions $f, g : \mathbb{R}^n \to [0,1]$ and any
$0 < \rho < 1$,

$$\mathbb{E}_\rho J(f(X), g(Y); \rho) \le J(\mathbb{E}f, \mathbb{E}g; \rho). \tag{1.2}$$

If $-1 < \rho < 0$ then the inequality (1.2) is reversed.

To see that Theorem 1.2 generalizes Theorem 1.1, consider $f = 1_{A_1}$ and
$g = 1_{A_2}$. Note that $J(0,0) = J(1,0) = J(0,1) = 0$, while $J(1,1) = 1$. Thus,
$J(f(X), g(Y)) = 1_{\{X \in A_1, Y \in A_2\}}$ and so the left hand side (resp. right hand side)
of Theorem 1.2 is the same as the left hand side (resp. right hand side) of
Theorem 1.1.
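
The function $J$ is easy to evaluate numerically, which can help when experimenting with the inequalities above. The following is a minimal sketch, not part of the original argument: it assumes the standard one-dimensional representation $\Pr_\rho(X_1 \le a, Y_1 \le b) = \int_{-\infty}^{a} \phi(s)\, \Phi\big((b - \rho s)/\sqrt{1-\rho^2}\big)\, ds$ and approximates the integral by Simpson's rule on a truncated range. The names `bivariate_cdf` and `J`, the truncation at $-8$ and the step count are illustrative choices.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard one-dimensional Gaussian


def bivariate_cdf(a, b, rho, steps=4000):
    """Approximate Pr(X1 <= a, Y1 <= b) for standard Gaussians with correlation rho.

    Integrates phi(s) * Phi((b - rho*s) / sqrt(1 - rho^2)) over s in [-8, a]
    with composite Simpson's rule (steps must be even, and -1 < rho < 1).
    """
    lo = -8.0  # the Gaussian mass below -8 is negligible
    if a <= lo:
        return 0.0
    a = min(a, 8.0)
    scale = math.sqrt(1.0 - rho * rho)
    h = (a - lo) / steps

    def integrand(s):
        return STD.pdf(s) * STD.cdf((b - rho * s) / scale)

    total = integrand(lo) + integrand(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0


def J(x, y, rho):
    """J(x, y; rho) = Pr(X1 <= Phi^{-1}(x), Y1 <= Phi^{-1}(y)) for x, y in [0, 1]."""
    # Boundary values, where Phi^{-1} is infinite, are handled directly.
    if x <= 0.0 or y <= 0.0:
        return 0.0
    if x >= 1.0:
        return y
    if y >= 1.0:
        return x
    return bivariate_cdf(STD.inv_cdf(x), STD.inv_cdf(y), rho)
```

At $\rho = 0$ the two coordinates are independent, so `J(x, y, 0.0)` should agree with `x * y`; the boundary cases reproduce $J(0,0) = J(1,0) = J(0,1) = 0$ and $J(1,1) = 1$.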

In fact, we can also go in the other direction and prove Theorem 1.2
from Theorem 1.1: given $f, g : \mathbb{R}^n \to [0,1]$, define $A_1, A_2 \subset \mathbb{R}^{n+1}$ to be the
epigraphs of $\Phi^{-1} \circ f$ and $\Phi^{-1} \circ g$ respectively. It can be easily checked, then,
that

$$\mathbb{E}_\rho J(f(X), g(Y); \rho) = \Pr_\rho(\tilde X \in A_1, \tilde Y \in A_2)$$

where $\tilde X$ and $\tilde Y$ are standard Gaussian vectors on $\mathbb{R}^{n+1}$ with $\mathbb{E} \tilde X_i \tilde Y_j = \delta_{ij}\rho$.
On the other hand, $\mathbb{E}f = \gamma_{n+1}(A_1)$ and $\mathbb{E}g = \gamma_{n+1}(A_2)$ and so the definition
of $J$ implies that

$$J(\mathbb{E}f, \mathbb{E}g; \rho) = \Pr_\rho(\tilde X \in B_1, \tilde Y \in B_2)$$

where $B_1$ and $B_2$ are parallel half-spaces with the same volumes as $A_1$
and $A_2$. Thus, Theorem 1.1 in $n+1$ dimensions implies Theorem 1.2 in $n$
dimensions.

However, we will give a proof of Theorem 1.2 that does not rely on
Theorem 1.1. We do this for two reasons: first, we believe that our proof of
Theorem 1.2 is simpler than existing proofs of Theorem 1.1. More
importantly, our proof of Theorem 1.2 is a good starting point for the main results
of the paper. In particular, it allows us to characterize the cases of equality
and near-equality. As we mentioned earlier, it is not known how to get such
results from existing proofs of Theorem 1.1.

1.3 Our results: Equality 

In our first main result, we get a complete characterization of the functions
for which equality in Theorem 1.2 is attained.

Theorem 1.3. For any measurable functions $f, g : \mathbb{R}^n \to [0,1]$ and any
$-1 < \rho < 1$ with $\rho \ne 0$, if equality is attained in (1.2) then there exist
$a, b, d \in \mathbb{R}^n$ such that either

$$f(x) = \Phi(\langle a, x - b\rangle) \text{ a.s.}, \qquad g(x) = \Phi(\langle a, x - d\rangle) \text{ a.s.}$$

or

$$f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}} \text{ a.s.}, \qquad g(x) = 1_{\{\langle a, x - d\rangle \ge 0\}} \text{ a.s.}$$

In particular, the second case of Theorem 1.3 implies that if $A_1$ and $A_2$
achieve equality in Theorem 1.1 then $A_1$ and $A_2$ must be almost surely equal
to parallel half-spaces.

1.4 Our results: Robustness 

Once we know the cases of equality, the next natural thing to ask is whether
they are robust: if $f$ and $g$ almost achieve equality in (1.2), in the sense
that $\mathbb{E}_\rho J(f(X), g(Y)) \ge J(\mathbb{E}f, \mathbb{E}g) - \delta$, does it follow that $f$ and $g$ must be
close to some functions of the form $\Phi(\langle a, x - b\rangle)$? In the case of the Gaussian
isoperimetric inequality, which can be viewed as a limiting form of Borell's
theorem, the question of robustness was first addressed by Cianchi et al. [10],
who showed that the answer was "yes," and gave a bound that depended
on both $\delta$ and $n$. The authors [31] then proved a similar result which had
no dependence on $n$, but a worse (logarithmic, instead of polynomial)
dependence on $\delta$. The arguments we will apply here are similar to those used
in [31], but with some improvements. In particular, we establish a result
with no dependence on the dimension, and with a polynomial dependence
on $\delta$ (although we suspect that the exponent is not optimal).

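Theorem 1.4. For measurable functions $f, g : \mathbb{R}^n \to [0,1]$, define

$$\delta = \delta(f, g) = J(\mathbb{E}f, \mathbb{E}g) - \mathbb{E}_\rho J(f(X), g(Y)) \tag{1.3}$$

and let

$$m = m(f, g) = \mathbb{E}f (1 - \mathbb{E}f)\, \mathbb{E}g (1 - \mathbb{E}g).$$

For any $0 < \rho < 1$ and any $\epsilon > 0$, there exist $0 < c(\rho), C(\rho, \epsilon) < \infty$ such that
for any $f, g : \mathbb{R}^n \to [0,1]$ there exist $a, b, d \in \mathbb{R}^n$ such that

$$\mathbb{E}|f(X) - \Phi(\langle a, X - b\rangle)| \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta^{\frac{1}{4} \frac{1 - \rho - \rho^2 + \rho^3}{1 - \rho + 3\rho^2 + \rho^3} - \epsilon}$$

$$\mathbb{E}|g(X) - \Phi(\langle a, X - d\rangle)| \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta^{\frac{1}{4} \frac{1 - \rho - \rho^2 + \rho^3}{1 - \rho + 3\rho^2 + \rho^3} - \epsilon}.$$

We should mention that a more careful tracking of constants in our proof
would improve the exponent of $\delta$ slightly. However, it would not bring the
exponent above $\frac{1}{4}$.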

Although Theorem 1.4 is stated only for $0 < \rho < 1$, the same result for
$-1 < \rho < 0$ follows from certain symmetries. Indeed, one can easily check from
the definition of $J$ that $J(x, y; \rho) = x - J(x, 1 - y; -\rho)$. Taking expectations,

$$\mathbb{E}_\rho J(f(X), g(Y); \rho) = \mathbb{E}f - \mathbb{E}_\rho J(f(X), 1 - g(Y); -\rho)
= \mathbb{E}f - \mathbb{E}_{-\rho} J(f(X), 1 - g(-Y); -\rho).$$

Now, suppose that $-1 < \rho < 0$ and that $f, g$ almost attain equality in
Theorem 1.2:

$$\mathbb{E}_\rho J(f(X), g(Y); \rho) \le J(\mathbb{E}f, \mathbb{E}g; \rho) + \delta.$$

Setting $\tilde g(y) = 1 - g(-y)$, this implies that

$$\mathbb{E}_{-\rho} J(f(X), \tilde g(Y); -\rho) \ge J(\mathbb{E}f, \mathbb{E}\tilde g; -\rho) - \delta.$$

Since $0 < -\rho < 1$, we can apply Theorem 1.4 to $f$ and $\tilde g$ to conclude that
$f$ and $\tilde g$ are close to the equality cases of Theorem 1.3, and it follows that
$f$ and $g$ are also close to one of these equality cases. Therefore, we will
concentrate for the rest of this article on the case $0 < \rho < 1$.

1.5 Optimal dependence on $\rho$ in the case $f = g$

The dependence on $\rho$ in Theorem 1.4 is particularly interesting as $\rho \to 1$,
since it is in that limit that Borell's inequality recovers the Gaussian
isoperimetric inequality. As it is stated, however, Theorem 1.4 does not
recover a robust version of the Gaussian isoperimetric inequality because of
its poor dependence on $\rho$ as $\rho \to 1$. In particular, as $\rho \to 1$, the constant
$C(\rho, \epsilon)$ grows to infinity, and the exponent of $\delta$ tends to zero.

It turns out that this poor dependence on $\rho$ is necessary in some sense.
For example, let

$$f(x) = 1_{\{x_1 \le 0\}}$$

$$g(x) = 1_{\{x_1 \le 2,\, x_2 \le 0 \text{ or } x_1 \le 1,\, x_2 > 0\}}.$$

Then

$$\Pr_\rho(f(X) = 1, g(Y) = 0) \le \Pr_\rho(X_1 \le 0, Y_1 \ge 1) \le \exp\left(-\frac{1}{2(1 - \rho^2)}\right),$$

which tends to zero very quickly as $\rho \to 1$. In particular, this means that as
$\rho \to 1$, $\delta(f, g)$ tends to zero exponentially fast even though $g$ is a constant
distance away from a half-space. Thus, the constant $C(\rho, \epsilon)$ must blow up
as $\rho \to 1$. Similarly, if we redefine $g$ as

$$g(x) = 1_{\{x_1 \le 1 + O(\delta),\, x_2 \le 0 \text{ or } x_1 \le 1 - O(\delta),\, x_2 > 0\}}$$

then we see that the exponent of $\delta$ in Theorem 1.4 must tend to zero as
$\rho \to 1$.

We can, however, avoid examples like the above if we restrict to the case
$f = g$. In this case, it turns out that $\delta(f, f)$ grows only like $(1 - \rho)^{-1/2}$ as $\rho \to 1$,
which is exactly the right rate for recovering the Gaussian isoperimetric
inequality.

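Theorem 1.5. For every $\epsilon > 0$, there is a $\rho_0 < 1$ and a $C(\epsilon)$ such that for
any $\rho_0 < \rho < 1$ and any $f : \mathbb{R}^n \to [0,1]$ with $\mathbb{E}f = 1/2$, there exists $a \in \mathbb{R}^n$
such that

$$\mathbb{E}|f(X) - \Phi(\langle a, X\rangle)| \le C(\epsilon) \left(\frac{\delta(f, f)}{\sqrt{1 - \rho}}\right)^{1/4 - \epsilon}.$$

The requirement $\mathbb{E}f = 1/2$ is there for technical reasons, and we do not
believe that it is necessary (see Conjecture 6.9).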

By applying Ledoux's result [28] connecting Borell's inequality with the
Gaussian isoperimetric inequality, Theorem 1.5 has the following corollary
(for the definition of Gaussian surface area, see [31]):

Corollary 1.6. For every $\epsilon > 0$, there is a $C(\epsilon) < \infty$ such that for every set
$A \subset \mathbb{R}^n$ such that $\Pr(A) = 1/2$ and $A$ has Gaussian surface area less than
$\frac{1}{\sqrt{2\pi}} + \delta$, there is a half-space $B$ such that

$$\Pr(A \Delta B) \le C(\epsilon)\, \delta^{1/4 - \epsilon}.$$

This should be compared with the work of Cianchi et al. [10], who gave
the best possible dependence on $\delta$, but suffered some unspecified dependence
on $n$:

Theorem 1.7. For every $n$ and every $a \in (0, 1)$, there is a constant $C(n, a)$
such that for every set $A \subset \mathbb{R}^n$ such that $\Pr(A) = a$ and $A$ has Gaussian
surface area less than $\phi(\Phi^{-1}(a)) + \delta$, there is a half-space $B$ such that

$$\Pr(A \Delta B) \le C(n, a)\, \delta^{1/2}.$$

Note that Theorem 1.7 is stronger than Corollary 1.6 in two senses, but
weaker in one. Theorem 1.7 is stronger since it applies to sets of all volumes
and because it has a better dependence on $\delta$ (in fact, Cianchi et al. show
that $\delta^{1/2}$ is the best possible dependence on $\delta$). However, Corollary 1.6 is
stronger in the sense that it, like the rest of our robustness results, has
no dependence on the dimension. For the applications we have in mind,
this dimension independence is more important than having optimal rates.
Nevertheless, we conjecture that it is possible to have both at the same time:

Conjecture 1.8. There are constants $0 < c, C < \infty$ such that for every
$A \subset \mathbb{R}^n$ with Gaussian surface area less than $\phi(\Phi^{-1}(\Pr(A))) + \delta$, there is a
half-space $B$ such that

$$\Pr(A \Delta B) \le C \Pr(A)^c\, \delta^{1/2}.$$

1.6 On highly correlated functions

Let us mention one more corollary of Theorem 1.5. We have used the
functional $\mathbb{E}_\rho J(f(X), f(Y))$ as a functional generalization of $\Pr_\rho(X \in A, Y \in A)$.
However, $\mathbb{E}_\rho f(X) f(Y)$ is another commonly used functional
generalization of $\Pr_\rho(X \in A, Y \in A)$ which appeared, for example, in [28]. Since
$xy \le J(x, y)$ for $0 < \rho < 1$, we see immediately that Theorem 1.2 holds when
the left hand side is replaced by $\mathbb{E}_\rho f(X) f(Y)$. The equality case,
however, turns out to be different: whereas equality in Theorem 1.2 holds for
$f(x) = \Phi(\langle a, x - b\rangle)$, there is equality in

$$\mathbb{E}_\rho f(X) f(Y) \le J(\mathbb{E}f, \mathbb{E}f; \rho) \tag{1.4}$$

only when $f$ is the indicator of a half-space. Moreover, a robustness result
for (1.4) follows fairly easily from Theorems 1.4 and 1.5.

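Corollary 1.9. For any $0 < \rho < 1$, there is a constant $C(\rho) < \infty$ such that
if $f : \mathbb{R}^n \to [0,1]$ satisfies $\mathbb{E}f = 1/2$ and

$$\mathbb{E}_\rho f(X) f(Y) \ge \frac{1}{4} + \frac{1}{2\pi} \arcsin(\rho) - \delta$$

then there is a half-space $B$ such that

$$\mathbb{E}|f(X) - 1_B(X)| \le C(\rho)\, \delta^c,$$

where $c > 0$ is a universal constant.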

1.7 Discrete applications 

Corollary 1.9 implies a robust version of the "majority is stablest"
theorem [32], which concerns functions of low influence and high noise stability;
for a function $f : \{-1,1\}^n \to \{-1,1\}$, we define the influence of the $i$th
coordinate by

$$\mathrm{Inf}_i(f) = \Pr\big(f(x_1, \dots, x_n) \ne f(x_1, \dots, x_{i-1}, -x_i, x_{i+1}, \dots, x_n)\big)$$

and the noise stability of $f$ by

$$\mathbb{S}_\rho(f) = \mathbb{E}_\rho f(\xi) f(\sigma)$$

where $(\xi, \sigma) = ((\xi_1, \dots, \xi_n), (\sigma_1, \dots, \sigma_n)) \in \{-1,1\}^n \times \{-1,1\}^n$ is chosen so
that $(\xi_i, \sigma_i) \in \{-1,1\}^2$ are independent random variables with $\mathbb{E}\xi_i = \mathbb{E}\sigma_i = 0$
and $\mathbb{E}_\rho \xi_i \sigma_i = \rho$.
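
For small $n$, both quantities can be computed exactly by enumerating the cube. The sketch below is illustrative only: it uses $\{0,1\}$-valued indicator functions (as in the $[0,1]$-valued setting of the Gaussian results), and it uses the fact that each pair $(\xi_i, \sigma_i)$ with uniform marginals and correlation $\rho$ takes the value $(a, b) \in \{-1,1\}^2$ with probability $(1 + \rho a b)/4$. The function names are assumptions of this sketch, not notation from [32].

```python
from itertools import product


def influence(f, n, i):
    """Inf_i(f): fraction of points x in {-1,1}^n where flipping x_i changes f(x)."""
    changed = 0
    for x in product((-1, 1), repeat=n):
        flipped = x[:i] + (-x[i],) + x[i + 1:]
        if f(x) != f(flipped):
            changed += 1
    return changed / 2 ** n


def noise_stability(f, n, rho):
    """S_rho(f) = E f(xi) f(sigma), by summing over all pairs (xi, sigma)."""
    total = 0.0
    for xi in product((-1, 1), repeat=n):
        f_xi = f(xi)
        if f_xi == 0:
            continue  # these terms contribute nothing to the expectation
        for sigma in product((-1, 1), repeat=n):
            weight = 1.0
            for a, b in zip(xi, sigma):
                # Pr(xi_i = a, sigma_i = b) for a rho-correlated uniform pair
                weight *= (1 + rho * a * b) / 4
            total += weight * f_xi * f(sigma)
    return total


def majority(x):
    """Indicator that more than half of the coordinates equal +1 (n odd)."""
    return 1 if sum(x) > 0 else 0


def dictator(x):
    """Indicator that the first coordinate equals +1."""
    return 1 if x[0] > 0 else 0
```

With $n = 3$ and $\rho = 1/2$, the dictator has noise stability $(1 + \rho)/4 = 0.375$ but its first coordinate has influence $1$, while the majority has the smaller noise stability $0.3515625$ and all influences equal to $1/2$. This is the trade-off between stability and influence that the results below quantify.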

The majority is stablest theorem [32] informally states that low-influence,
balanced functions cannot be essentially more noise-stable than the majority 
function. This was first explicitly conjectured by Khot, Kindler, Mossel, and 
O'Donnell [23] in a paper studying the hardness of approximation of Max- 
Cut. It was used to show that approximating the maximum cut in a graph 
to within a factor of about 0.87856 is unique-games hard. This result is opti- 
mal, since the famous efficient algorithm of Goemans and Williamson [15] is 
guaranteed to find a cut that is within a 0.87856 factor of the maximum cut. 
A special case of the majority is stablest theorem was conjectured earlier by 
Kalai [17] in the context of his quantitative version of Arrow's theorem.

Combining our Gaussian results with the original proof from [32], we
obtain a robust version of the majority is stablest theorem:

Theorem 1.10. For every $\epsilon > 0$, there is a $\tau > 0$ such that the following
holds: suppose that $f : \{-1,1\}^n \to [0,1]$ is a function with $\mathrm{Inf}_i(f) \le \tau$ for
every $i$. Then for every $0 < \rho < 1$,

$$\mathbb{S}_\rho(f) \le J(\mathbb{E}f, \mathbb{E}f; \rho) + \epsilon. \tag{1.5}$$

If, moreover, there is some $0 < \rho < 1$ such that

$$\mathbb{S}_\rho(f) \ge J(\mathbb{E}f, \mathbb{E}f; \rho) - \epsilon \tag{1.6}$$

then there exist $a, b \in \mathbb{R}^n$ such that

$$\mathbb{E}\big|f(\xi) - 1_{\{\langle a, \xi - b\rangle \ge 0\}}\big| \le C(\rho)\, \epsilon^c,$$

where $c, C > 0$ are universal constants.

If we set $a_n = \frac{1}{\sqrt{n}}(1, \dots, 1)$ and $b_n = \Phi^{-1}(\mathbb{E}f)\, a_n$, then the central limit
theorem implies that $\mathbb{E} 1_{\{\langle a_n, \xi - b_n\rangle \ge 0\}} \to \mathbb{E}f$ and $\mathbb{S}_\rho(1_{\{\langle a_n, \xi - b_n\rangle \ge 0\}}) \to J(\mathbb{E}f, \mathbb{E}f; \rho)$.
In the case $\mathbb{E}f = \frac{1}{2}$ and $b_n = 0$, (1.5) says, therefore, that no low-influence
function can be much more noise stable than the simple majority function;
this is the content of the majority is stablest theorem from [32]. Our
contribution is (1.6), which says that the only low-influence functions which come
close to this bound are close to weighted majority functions.

We remark that Theorem 1.10 is not the most general possible theorem
that we can prove. In particular, we could state a two-function version of
Theorem 1.10 or a version that uses $\mathbb{E}_\rho J(f(\xi), f(\sigma); \rho)$ in place of $\mathbb{S}_\rho(f)$. All
of these variations, however, are proved in essentially the same way, namely 
by combining the ideas from [32] with the appropriate Gaussian robustness 
result. In order to avoid repetition, therefore, we will only state and prove 
one version. 

1.8 Spherical noise stability and the Max-Cut problem 

The well-known similarity between a Gaussian vector and a uniformly
random vector on a high-dimensional sphere suggests that there might be a
spherical analogue of our Gaussian noise sensitivity result. The
correlation structure on the sphere that is most useful is the uniform measure
over all pairs of points $(x, y)$ whose inner product $\langle x, y\rangle$ is exactly $\rho$.
Under this model of noise, we can use robust Gaussian noise sensitivity to
show, asymptotically in the dimension, robustness for spherical noise
sensitivity. This uses the theory of spherical harmonics and has applications
to rounding semidefinite programs (in particular, the Goemans-Williamson
algorithm for Max-Cut). Our proof uses and generalizes the work of Klartag
and Regev [26], in which a related problem was studied in the context of
one-way communication complexity.

Our spherical noise stability result mostly follows from Theorem 1.4 by
replacing $X$ and $Y$ by $X/|X|$ and $Y/|Y|$. When $n$ is large, these
renormalized Gaussian vectors are uniformly distributed on the sphere and their
inner product is tightly concentrated around $\rho$. The fact that their inner
product is not exactly $\rho$ causes some difficulty, particularly because $Q_\rho$ is
actually orthogonal to the joint distribution of two normalized Gaussians.
Working through this difficulty with some properties of spherical harmonics,
we obtain the following spherical analogue of Theorem 1.4.

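Theorem 1.11. Let $0 < \rho < 1$ and write $Q_\rho$ for the measure of $(X, Y)$ on
the sphere $S^{n-1}$ where the pair $(X, Y)$ is uniformly distributed in

$$\{(x, y) \in S^{n-1} \times S^{n-1} : \langle x, y\rangle = \rho\}.$$

For measurable $A_1, A_2 \subset S^{n-1}$, define

$$\delta = \delta(A_1, A_2) = Q_\rho(X \in B_1, Y \in B_2) - Q_\rho(X \in A_1, Y \in A_2),$$

where $B_1$ and $B_2$ are parallel spherical caps with the same volumes as $A_1$
and $A_2$ respectively. Define also

$$m(A_1, A_2) = p(1 - p)\, q(1 - q)$$

where $p = \Pr(X \in A_1)$ and $q = \Pr(Y \in A_2)$.

For any $A_1, A_2 \subset S^{n-1}$, there exist parallel spherical caps $B_1$ and $B_2$ such
that

$$Q(A_1 \Delta B_1) \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta_*^{\frac{1}{4} \frac{1 - \rho - \rho^2 + \rho^3}{1 - \rho + 3\rho^2 + \rho^3} - \epsilon}$$

$$Q(A_2 \Delta B_2) \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta_*^{\frac{1}{4} \frac{1 - \rho - \rho^2 + \rho^3}{1 - \rho + 3\rho^2 + \rho^3} - \epsilon},$$

where $\delta_* = \max(\delta, n^{-1/2} \log n)$.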

The case $\rho = 0$ of the above theorem is related to work by Klartag
and Regev [26]. In this case one expects that $X$ and $Y$ should behave as
independent random variables on $S^{n-1}$ and that therefore for all $A_1, A_2$,
$Q_0(X \in A_1, Y \in A_2)$ should be close to $Q(X \in A_1)\, Q(Y \in A_2)$. Indeed the
main technical statement of Klartag and Regev (Theorem 5.2) says that for
every two sets,

$$|Q_0(X \in A_1, Y \in A_2) - Q(X \in A_1)\, Q(Y \in A_2)| \le \frac{C}{n}.$$

In other words the results of Klartag and Regev show that in the case $\rho = 0$,
a uniform orthogonal pair $(X, Y)$ on the sphere behaves like a pair of
independent random variables up to an error of order $n^{-1}$, while our results
show that for $0 < \rho < 1$, pairs $(X, Y)$ that are $\rho$-correlated behave like Gaussians
with the same correlation.

1.8.1 Rounding the Goemans-Williamson algorithm

Let $G = (V, E)$ be a graph and recall that the Max-Cut problem is to find a
set $A \subset V$ such that the number of edges between $A$ and $V \setminus A$ is maximal.
It is of course equivalent to look for a function $f : V \to \{-1, 1\}$ such that
$\sum_{(u,v) \in E} |f(u) - f(v)|^2$ is maximal. Goemans and Williamson's breakthrough
was to realize that this combinatorial optimization problem can be efficiently
solved if we relax the range $\{-1, 1\}$ to $S^{n-1}$. Let us say, therefore, that an
embedding $f$ of a graph $G = (V, E)$ into the sphere $S^{n-1}$ is optimal if

$$\sum_{(u,v) \in E} |f(u) - f(v)|^2$$

is maximal. An oblivious rounding procedure is a (possibly random) function
$R : S^{n-1} \to \{-1, 1\}$ (we call it "oblivious" because it does not look at the
graph $G$). We will then denote by $\mathrm{Cut}(G, R)$ the expected value of the cut
produced by rounding the worst possible optimal spherical embedding of $G$:

$$\mathrm{Cut}(G, R) = \min_f \mathbb{E}\, \frac{1}{2} \sum_{(u,v) \in E} |R(f(u)) - R(f(v))|,$$

where the minimum is over all optimal embeddings $f$. If $\mathrm{MaxCut}(G)$ denotes
the maximum cut in $G$, then Goemans and Williamson [15] showed that
when $R(x) = \mathrm{sgn}(\langle X, x\rangle)$ for a standard Gaussian vector $X$, then for every
graph $G$,

$$\mathrm{Cut}(G, R) \ge \mathrm{MaxCut}(G) \min_\theta \alpha_\theta,$$

where $\alpha_\theta = \frac{2}{\pi} \frac{\theta}{1 - \cos\theta}$. In the other direction, Feige and Schechtman [14]
showed that for every oblivious rounding scheme $R$ and every $\epsilon > 0$, there is
a graph $G$ such that

$$\mathrm{Cut}(G, R) \le \mathrm{MaxCut}(G) \left(\epsilon + \min_\theta \alpha_\theta\right).$$

In other words, no rounding scheme is better than the half-space rounding
scheme. Using Theorem 1.4, we can go further:

Theorem 1.12. Suppose $R$ is a rounding scheme such that for every $G$,

$$\mathrm{Cut}(G, R) \ge \mathrm{MaxCut}(G) \left(\min_\theta \alpha_\theta - \epsilon\right).$$

Then there is a hyperplane rounding scheme $\tilde R$ such that

$$\mathbb{E}|R(Y) - \tilde R(Y)| \le C \epsilon^c,$$

where $Y$ is a uniform (independent of $R$ and $\tilde R$) random vector on $S^{n-1}$,
and $C$ and $c$ are absolute constants.

In other words, any rounding scheme that is almost optimal is essentially 
the same as rounding by a random half-space. 
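
The constant $\min_\theta \alpha_\theta$ that appears in these bounds can be approximated with a short computation. The sketch below assumes the standard Goemans-Williamson ratio $\alpha_\theta = \frac{2}{\pi} \frac{\theta}{1 - \cos\theta}$ on $0 < \theta \le \pi$: a random hyperplane separates two unit vectors at angle $\theta$ with probability $\theta/\pi$, while the corresponding edge contributes $(1 - \cos\theta)/2$ to the relaxed objective. The grid search and the function names are illustrative choices.

```python
import math


def alpha(theta):
    """Ratio (theta / pi) / ((1 - cos(theta)) / 2) for an angle 0 < theta <= pi."""
    return (2.0 / math.pi) * theta / (1.0 - math.cos(theta))


def goemans_williamson_constant(grid=200_000):
    """Approximate the minimizing angle and min over theta of alpha(theta) on a uniform grid."""
    best_theta, best_value = math.pi, alpha(math.pi)
    for i in range(1, grid + 1):
        theta = math.pi * i / grid
        value = alpha(theta)
        if value < best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value
```

The minimum is attained at an angle of roughly 2.33 radians and its value agrees with the factor 0.87856 quoted earlier.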

1.9 Proof Techniques 

1.9.1 Borell's theorem 

We prove Theorem 1.2 by differentiating along the Ornstein-Uhlenbeck
semigroup. This technique was used by Bakry and Ledoux [3] in their proof of the
Gaussian isoperimetric inequality and, more generally, a Gaussian version of
the Levy-Gromov comparison theorem. Recall that the Ornstein-Uhlenbeck
semigroup can be specified by defining, for every $t \ge 0$, the operator

$$(P_t f)(x) = \int_{\mathbb{R}^n} f\big(e^{-t} x + \sqrt{1 - e^{-2t}}\, y\big)\, d\gamma_n(y). \tag{1.7}$$

Note that $P_t f \to f$ as $t \to 0$ (pointwise, and also in $L^p$), while $P_t f \to \mathbb{E}f$ as
$t \to \infty$.

Let $f_t = P_t f$, $g_t = P_t g$, and consider the quantity

$$R_t := \mathbb{E}_\rho J(f_t(X), g_t(Y)). \tag{1.8}$$

As $t \to 0$, $R_t$ converges to the left hand side of (1.2); as $t \to \infty$, $R_t$ converges
to the right hand side of (1.2). We will prove Theorem 1.2 by showing that
$\frac{dR_t}{dt} \ge 0$ for all $t > 0$.

1.9.2 The equality case 

The equality case almost comes for free from our proof of Theorem 1.2.
Indeed, Lemma 2.2 writes $\frac{dR_t}{dt}$ as the expectation of a strictly positive quantity
times

$$\big|\nabla(\Phi^{-1} \circ f_t)(X) - \nabla(\Phi^{-1} \circ g_t)(Y)\big|^2,$$

where $|\cdot|$ denotes the Euclidean norm. Now, if there is equality in
Theorem 1.2 then $\frac{dR_t}{dt}$ must be zero for all $t$, which implies that the
expression above must be zero almost surely. This implies that $\nabla(\Phi^{-1} \circ f_t)$ and
$\nabla(\Phi^{-1} \circ g_t)$ are almost surely equal to the same constant, and therefore $f_t$
and $g_t$ can be written as $\Phi$ composed with a linear function. We can then
infer the same statement for $f$ and $g$ because $P_t$ is one-to-one.

1.9.3 Robustness 

Our approach to robustness begins similarly to the approach in our recent
work [31]. If $\delta(f, g)$ is small then $\frac{dR_t}{dt}$ must also be small for most $t > 0$.
Looking at the expression in Lemma 2.2, we first concentrate on the main
term: $|\nabla v_t(X) - \nabla w_t(Y)|^2$ where $v_t = \Phi^{-1} \circ f_t$ and $w_t = \Phi^{-1} \circ g_t$. Using
an analogue of Poincaré's inequality, we argue that if the expected value of
$|\nabla v_t(X) - \nabla w_t(Y)|^2$ is small then $v_t$ and $w_t$ are close to linear functions.

Considerable effort goes into controlling the "secondary terms" of the
expression in Lemma 2.2. This control is established in a sequence of
analytic results, which rely heavily on the smoothness of the semigroup $P_t$,
concentration of Gaussian vectors and $L^p$ interpolation inequalities. In the
end, we show that if $\delta = \delta(f, g)$ is small then for every $t > 0$, $v_t$ is $\epsilon(\delta, t)$-close
to a linear function. Since $\Phi$ is a contraction, this implies that $f_t$ must be
close to a function of the form $\Phi(\langle x, a\rangle - b)$.

We would like to then conclude the proof by applying $P_t^{-1}$, and
saying that $f$ must be close to $P_t^{-1} \Phi(\langle x, a\rangle - b)$, which also has the form
$\Phi(\langle x, a'\rangle - b')$. The obvious problem here is that $P_t^{-1}$ is not a bounded
operator, but we work around this by arguing that it acts boundedly on the
functions that we care about. This part of the argument marks a substantial
departure from [31], where our argument used smoothness and spectral
information. Here, we will use a geometric argument to say that if $h = 1_A - 1_B$
where $B$ is a half-space, then $\mathbb{E}|h|$ can be bounded in terms of $\mathbb{E}|P_t h|$. This
improved argument is essentially the reason that the rates in Theorem 1.4
are polynomial, while the rates in [31] were logarithmic.

2 Proof of Borell's theorem 

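Recall the definition of $P_t$ and $R_t$ from (1.7) and (1.8). In this section, we will
compute $\frac{dR_t}{dt}$ and show that it is non-negative, thereby proving Theorem 1.2.
First, define $v_t = \Phi^{-1} \circ f_t$, $w_t = \Phi^{-1} \circ g_t$, and $K(x, y; \rho) = \Pr_\rho(X_1 \le x, Y_1 \le y)$.
Then

$$J(f_t(X), g_t(Y)) = K(v_t(X), w_t(Y)).$$

Lemma 2.1.

$$\frac{\partial K(x, y)}{\partial x} = \phi(x)\, \Phi\left(\frac{y - \rho x}{\sqrt{1 - \rho^2}}\right), \qquad \frac{\partial K(x, y)}{\partial y} = \phi(y)\, \Phi\left(\frac{x - \rho y}{\sqrt{1 - \rho^2}}\right).$$

Proof. Note that $Y_1$ can be written as $\rho X_1 + \sqrt{1 - \rho^2}\, Z$, where $X_1$ and $Z$ are
independent standard Gaussians. Then
$\{X_1 \le x, Y_1 \le y\} = \big\{X_1 \le x,\ Z \le \frac{y - \rho X_1}{\sqrt{1 - \rho^2}}\big\}$, and so

$$K(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{\frac{y - \rho s}{\sqrt{1 - \rho^2}}} \phi(s)\, \phi(t)\, dt\, ds.$$

Differentiating in $x$,

$$\frac{\partial K(x, y)}{\partial x} = \phi(x) \int_{-\infty}^{\frac{y - \rho x}{\sqrt{1 - \rho^2}}} \phi(t)\, dt = \phi(x)\, \Phi\left(\frac{y - \rho x}{\sqrt{1 - \rho^2}}\right).$$

This proves the first claim. The second claim follows because $K(x, y)$ is
symmetric in $x$ and $y$. □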

Lemma 2.2.

$$\frac{dR_t}{dt} = \frac{\rho}{2\pi \sqrt{1 - \rho^2}}\, \mathbb{E}_\rho \exp\left(-\frac{v_t^2 + w_t^2 - 2\rho v_t w_t}{2(1 - \rho^2)}\right) |\nabla v_t - \nabla w_t|^2.$$

Before we prove Lemma 2.2, note that it immediately implies
Theorem 1.2 because the right hand side in Lemma 2.2 is clearly non-negative.



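Proof. Set $L = \Delta - \langle x, \nabla\rangle$; it is well-known (and easy to check by direct
computation) that $\frac{df_t}{dt} = L f_t$ for all $t \ge 0$. The integration by parts formula

$$\mathbb{E} f(X)\, L g(X) = -\mathbb{E}\langle \nabla f(X), \nabla g(X)\rangle \tag{2.1}$$

for bounded smooth functions $f$ and $g$ is also standard and easily checked.
Thus,

$$\frac{dR_t}{dt} = \mathbb{E}_\rho\left(\frac{\partial K}{\partial x}(v_t(X), w_t(Y))\, \frac{dv_t}{dt}(X)\right) + \mathbb{E}_\rho\left(\frac{\partial K}{\partial y}(v_t(X), w_t(Y))\, \frac{dw_t}{dt}(Y)\right). \tag{2.2}$$

Now, the chain rule implies that $\frac{dv_t}{dt} = \frac{L f_t}{\phi(v_t)}$. Hence, the first term of (2.2) is

$$\mathbb{E}_\rho \Phi\left(\frac{w_t(Y) - \rho v_t(X)}{\sqrt{1 - \rho^2}}\right) L f_t(X), \tag{2.3}$$

where we have used Lemma 2.1. Now write $Y = \rho X + \sqrt{1 - \rho^2}\, Z$ (with $X$
and $Z$ independent); conditioning on $Z$ and applying the integration by
parts (2.1) with respect to $X$, we have

$$\mathbb{E}_\rho \Phi\left(\frac{w_t - \rho v_t}{\sqrt{1 - \rho^2}}\right) L f_t
= -\frac{\rho}{\sqrt{1 - \rho^2}}\, \mathbb{E}_\rho\, \phi\left(\frac{w_t - \rho v_t}{\sqrt{1 - \rho^2}}\right) \langle \nabla w_t - \nabla v_t, \nabla f_t\rangle$$

$$= \frac{\rho}{\sqrt{1 - \rho^2}}\, \mathbb{E}_\rho\, \phi\left(\frac{w_t - \rho v_t}{\sqrt{1 - \rho^2}}\right) \phi(v_t)\, \langle \nabla v_t - \nabla w_t, \nabla v_t\rangle, \tag{2.4}$$

where we have written, for brevity, $v_t$ and $w_t$ instead of $v_t(X)$ and $w_t(Y)$,
and used $\nabla f_t = \phi(v_t) \nabla v_t$.
Since $K$ is symmetric in its arguments, there is a similar computation for
the second term of (2.2):

$$\frac{\rho}{\sqrt{1 - \rho^2}}\, \mathbb{E}_\rho\, \phi\left(\frac{v_t - \rho w_t}{\sqrt{1 - \rho^2}}\right) \phi(w_t)\, \langle \nabla w_t - \nabla v_t, \nabla w_t\rangle. \tag{2.5}$$

Note that

$$\phi\left(\frac{w_t - \rho v_t}{\sqrt{1 - \rho^2}}\right) \phi(v_t) = \phi\left(\frac{v_t - \rho w_t}{\sqrt{1 - \rho^2}}\right) \phi(w_t) = \frac{1}{2\pi} \exp\left(-\frac{v_t^2 + w_t^2 - 2\rho v_t w_t}{2(1 - \rho^2)}\right);$$

hence, we can plug (2.4) and (2.5) into (2.2) to obtain

$$\frac{dR_t}{dt} = \frac{\rho}{2\pi \sqrt{1 - \rho^2}}\, \mathbb{E}_\rho \exp\left(-\frac{v_t^2 + w_t^2 - 2\rho v_t w_t}{2(1 - \rho^2)}\right) |\nabla v_t - \nabla w_t|^2.$$

□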



3 The equality case 

Lemma 2.2 allows us to analyze the equality case (Theorem 1.3) with
very little additional effort. Similar ideas were used by Carlen and Kerce [9]
to analyze the equality case in the standard Gaussian isoperimetric problem.
Clearly, Lemma 2.2 implies that if for every $t$, $v_t$ and $w_t$ are linear functions
with the same slope, then equality is attained in Theorem 1.2. To prove
Theorem 1.3 we will show that the converse also holds (i.e. if equality is
attained then $v_t$ and $w_t$ are linear functions with the same slope). Then we
will take $t \to 0$ to obtain the desired conclusion regarding $f$ and $g$.
First of all, if $f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}}$ then a direct computation gives

$$f_t(x) = \Phi\left(k_t \left\langle \frac{a}{|a|},\, x - e^t b \right\rangle\right) \tag{3.1}$$

where $k_t = (e^{2t} - 1)^{-1/2}$. Since $P_t$ is injective, it follows that whenever
$f_t = \Phi(\langle a, x - b'\rangle)$ for some $a, b'$ with $|a| = k_t$, $f$ must have the form
$f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}}$. Since, moreover, $k_t$ is decreasing in $t$, we have the following
lemma:

Lemma 3.1. If $f_t(x) = \Phi(\langle a, x - b'\rangle)$ for some $a, b' \in \mathbb{R}^n$ with $|a| \le k_t$, then
there exists $b \in \mathbb{R}^n$ such that if $\tilde f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}}$ then $f = P_s \tilde f$, where $s$
solves $|a| = k_{s+t}$.

In order to apply Lemma l3.lt we will use the following pointwise bound 
on Vvt, whose proof can be found in [3]. Note that the bound is sharp 
because, according to (]3.ip . equality is attained when / is the indicator 
function of a half-space. 
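The direct computation behind (3.1) can be checked numerically. The sketch below assumes the standard Mehler representation $(P_t f)(x) = \mathbb{E} f(e^{-t}x + \sqrt{1-e^{-2t}}\,Z)$ for the Ornstein-Uhlenbeck semigroup (a convention not restated in this section) and compares a Monte Carlo estimate against the closed form $\Phi(k_t\langle a/|a|, x - e^t b\rangle)$. All function names are illustrative.

```python
import math
import random


def Phi(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def k(t):
    """k_t = (e^{2t} - 1)^{-1/2}."""
    return 1.0 / math.sqrt(math.exp(2.0 * t) - 1.0)


def pt_halfspace_closed(x, a, b, t):
    """Closed form for P_t applied to the indicator of {y : <a, y - b> >= 0}."""
    norm = math.sqrt(sum(ai * ai for ai in a))
    inner = sum((ai / norm) * (xi - math.exp(t) * bi) for ai, xi, bi in zip(a, x, b))
    return Phi(k(t) * inner)


def pt_halfspace_mc(x, a, b, t, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the same quantity via Mehler's formula (assumed convention)."""
    rng = random.Random(seed)
    decay = math.exp(-t)
    noise = math.sqrt(1.0 - decay * decay)
    hits = 0
    for _ in range(n_samples):
        y = [decay * xi + noise * rng.gauss(0.0, 1.0) for xi in x]
        if sum(ai * (yi - bi) for ai, yi, bi in zip(a, y, b)) >= 0.0:
            hits += 1
    return hits / n_samples
```

Since $|\nabla \Phi^{-1}(f_t)| = k_t$ exactly for such $f$, this is also the case in which the bound of Lemma 3.2 is attained.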

Lemma 3.2. For any function $f : \mathbb{R}^n \to [0, 1]$, any $t > 0$, and any $x \in \mathbb{R}^n$,

\[ |\nabla v_t(x)| \le k_t. \]

Proof of Theorem 1.3. Suppose that equality is attained in (1.2). Since $\frac{dR_t}{dt}$
is non-negative, it must be zero for almost every $t > 0$. In particular, we
may fix some $t > 0$ such that $\frac{dR_t}{dt} = 0$. Note that everything in Lemma 2.2
is strictly positive, except for the last term, which can be zero. Therefore,
$\frac{dR_t}{dt} = 0$ implies that $\nabla v_t(X) = \nabla w_t(Y)$ almost surely. Since the conditional
distribution of $Y$ given $X$ is fully supported, $\nabla v_t$ and $\nabla w_t$ must be almost
surely equal to some constant $a \in \mathbb{R}^n$. Moreover, $v_t$ and $w_t$ are smooth
functions (because $f_t$, $g_t$ and $\Phi^{-1}$ are smooth); hence, $v_t(x) = \langle a, x - b'\rangle$ and
$w_t(x) = \langle a, x - d'\rangle$ for some $b', d' \in \mathbb{R}^n$, and so

\[ f_t(x) = \Phi(\langle a, x - b'\rangle), \qquad g_t(x) = \Phi(\langle a, x - d'\rangle). \]

Now, Lemma 3.2 asserts that $|a| = |\nabla v_t| \le k_t$. Hence, Lemma 3.1 implies
that there is some $b$ such that if $\tilde f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}}$ then $f = P_s \tilde f$, where $s$
solves $|a| = k_{s+t}$. In particular, $f$ takes one of the two forms indicated in
Theorem 1.3: if $s = 0$ then $f(x) = \tilde f(x) = 1_{\{\langle a, x - b\rangle \ge 0\}}$. On the other hand, $s > 0$
implies, by (3.1), that $f = \tilde f_s = \Phi(k_s \langle \frac{a}{|a|}, x - e^s b\rangle)$, which we can write in the form
$\Phi(\langle a, x - b\rangle)$ by replacing $k_s \frac{a}{|a|}$ with $a$ and $e^s b$ with $b$. We complete the
proof by applying the same argument to $g$. □
4 Robustness: approximation for large t



The proof of Theorem 1.4 follows the same general lines as the one in [31].
Our starting point is Lemma 2.2 and the observation that if (1.2) is close to
an equality then $\frac{dR_t}{dt}$ must be small for most $t$. For such $t$, using Lemma 2.2
we will argue that $v_t$ must be close to linear for that $t$; it then follows that
$f_t$ must be close to one of the equality cases in Theorem 1.3. Finally, we
use a time-reversal argument to show that $f$ must be close to one of those
equality cases also.

Our proof will be divided into two main parts. In this section, we will
show that $v_t$ is close to linear; we will give the time-reversal argument in
Section 5. The main result in this section, therefore, is Proposition 4.1,
which says that $f_t$ must be close to one of the equality cases of Theorem 1.3.
Recall the definition of $\delta$ from (1.3), and recall that $k_t = (e^{2t} - 1)^{-1/2}$.

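Proposition 4.1. For any $0 < \rho < 1$, and for any $t > 0$, there exists $C(t,\rho)$
such that for any $f, g$ and for any $0 < \alpha < 1$, there exist $b, d \in \mathbb{R}$ and $a \in \mathbb{R}^n$
with $|a| \le k_t$ such that

$\mathbb{E}(f_t(X) - \Phi(\langle a, X\rangle - b))^2 + \mathbb{E}(g_t(X) - \Phi(\langle a, X\rangle - d))^2$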

< C(t, p)m(f, g) «?c£?) a (i+«o I -) W/ci-') ^ 

where $m(f,g) = \mathbb{E}f(1 - \mathbb{E}f)\,\mathbb{E}g(1 - \mathbb{E}g)$.

Let us observe, and this will be important when we apply Proposition 4.1,
that by Lemma 3.1, $|a| \le k_t$ implies that $\Phi(\langle a, \cdot\rangle - b)$ can be
written in the form $P_{t+s} 1_B$ for some $s \ge 0$ and some half-space $B$.

The main goal of this section is to prove Proposition 4.1. The proof
proceeds according to the following steps:

• First, using a Poincaré-like inequality (Lemma 4.2) we show that if
$\mathbb{E}_\rho|\nabla v(X) - \nabla w(Y)|^2$ is small then $v$ and $w$ are close to linear functions
(with the same slope).

• In Proposition 4.3 we use the reverse Hölder inequality and some
concentration properties to show that if $\frac{dR_t}{dt}$ is small, then $\mathbb{E}_\rho|\nabla v_t(X) -
\nabla w_t(Y)|^{2r}$ must be small for some $r < 1$.

• Using Lemma 3.2, we argue that if $\mathbb{E}_\rho|\nabla v_t(X) - \nabla w_t(Y)|^{2r}$ is small then
$\mathbb{E}_\rho|\nabla v_t(X) - \nabla w_t(Y)|^2$ is also small. Thus, we can apply the Poincaré
inequality mentioned in the first bullet point, and so we obtain linear
approximations for $v_t$ and $w_t$.
4.1 A Poincaré-like inequality

Recall that we proved the equality case by arguing that if $\frac{dR_t}{dt} = 0$ then
$|\nabla v_t(X) - \nabla w_t(Y)|$ is identically zero, so $\nabla v_t$ and $\nabla w_t$ must be constant and
thus $v_t$ and $w_t$ must be linear. The first step towards a robustness result is
to show that if $|\nabla v_t(X) - \nabla w_t(Y)|$ is small, then $v_t$ and $w_t$ must be almost
linear, and with the same slope.

Lemma 4.2. For any smooth functions $v, w \in L^2(\mathbb{R}^n, \gamma_n)$, if we set $a =
\frac12(\mathbb{E}\nabla v + \mathbb{E}\nabla w)$ then for any $0 < \rho < 1$,

\[ \mathbb{E}(v(X) - \langle X, a\rangle - \mathbb{E}v)^2 + \mathbb{E}(w(X) - \langle X, a\rangle - \mathbb{E}w)^2 \le \frac{\mathbb{E}_\rho|\nabla v(X) - \nabla w(Y)|^2}{2(1-\rho)}. \]

We remark that Lemma 4.2 achieves equality when $v$ and $w$ are quadratic
polynomials which differ only in the constant term.
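As an illustration of the equality case, here is a worked example under the simplest possible choice (the specific functions $v(x) = c x_1^2$ and $w = v + \kappa$ are ours, chosen only for illustration). Then $\mathbb{E}\nabla v = \mathbb{E}\nabla w = 0$, so $a = 0$, and both sides of Lemma 4.2 can be computed by hand:

```latex
\begin{align*}
\mathbb{E}(v(X) - \mathbb{E}v)^2 + \mathbb{E}(w(X) - \mathbb{E}w)^2
  &= 2\,\mathrm{Var}(c X_1^2) = 2 \cdot 2c^2 = 4c^2, \\
\frac{\mathbb{E}_\rho |\nabla v(X) - \nabla w(Y)|^2}{2(1-\rho)}
  &= \frac{4c^2\, \mathbb{E}_\rho (X_1 - Y_1)^2}{2(1-\rho)}
   = \frac{4c^2 \cdot 2(1-\rho)}{2(1-\rho)} = 4c^2 .
\end{align*}
```

Here we used $\mathrm{Var}(X_1^2) = 2$ and $\mathbb{E}_\rho(X_1 - Y_1)^2 = 2 - 2\rho$ for $\rho$-correlated standard Gaussians.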

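In order to prove Lemma 4.2, we recall the Hermite polynomials: for
$k \in \mathbb{N}$, define $H_k(x) = (-1)^k (k!)^{-1/2} e^{x^2/2} \frac{d^k}{dx^k} e^{-x^2/2}$. It is well-known that the $H_k$
form an orthonormal basis of $L^2(\mathbb{R}, \gamma_1)$. For a multiindex $\alpha \in \mathbb{N}^n$, let

\[ H_\alpha(x) = \prod_{i=1}^n H_{\alpha_i}(x_i). \]

Then the $H_\alpha$ form an orthonormal basis of $L^2(\mathbb{R}^n, \gamma_n)$. Define $|\alpha| = \sum_i \alpha_i$;
note that $H_\alpha$ is linear if and only if $|\alpha| = 1$, and $\alpha_i = 0$ implies that $\frac{\partial}{\partial x_i} H_\alpha = 0$.
If $\alpha_i > 0$, define $S_i\alpha$ by $(S_i\alpha)_i = \alpha_i - 1$ and $(S_i\alpha)_j = \alpha_j$ for $j \ne i$. Then a
well-known recurrence for Hermite polynomials states that

\[ \frac{\partial}{\partial x_i} H_\alpha = \begin{cases} \sqrt{\alpha_i}\, H_{S_i\alpha} & \text{if } \alpha_i > 0 \\ 0 & \text{if } \alpha_i = 0. \end{cases} \]

In particular,

\[ \mathbb{E}\Big(\frac{\partial}{\partial x_i} H_\alpha\Big)^2 = \alpha_i. \tag{4.1} \]

It will be convenient for us to reparametrize the Ornstein-Uhlenbeck
semigroup $P_t$: for $0 < \rho < 1$, let $T_\rho = P_{\log(1/\rho)}$. It is then easily checked that
for any $v \in L^1(\mathbb{R}^n, \gamma_n)$, $\mathbb{E}_\rho(v(Y) \mid X) = (T_\rho v)(X)$.

The final piece of background that we need before proving Lemma 4.2 is
the fact that $T_\rho$ acts diagonally on the Hermite basis, with

\[ T_\rho H_\alpha = \rho^{|\alpha|} H_\alpha. \tag{4.2} \]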
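The diagonal action (4.2) can be verified exactly in one dimension by working with polynomial coefficients. The sketch below assumes the representation $(T_\rho p)(x) = \mathbb{E}\, p(\rho x + \sqrt{1-\rho^2}\, Z)$, which follows from writing $Y = \rho X + \sqrt{1-\rho^2}\,Z$; it uses the unnormalized probabilists' Hermite polynomials, since the normalization $(k!)^{-1/2}$ does not affect the eigenvalue relation. The helper names are illustrative.

```python
import math


def hermite(k):
    """Coefficients (lowest degree first) of the probabilists' Hermite polynomial He_k."""
    prev, cur = [1.0], [0.0, 1.0]
    if k == 0:
        return prev
    for j in range(1, k):
        # Three-term recurrence: He_{j+1}(x) = x * He_j(x) - j * He_{j-1}(x)
        nxt = [0.0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= j * c
        prev, cur = cur, nxt
    return cur


def gaussian_moment(m):
    """E[Z^m] for a standard Gaussian Z: zero for odd m, (m-1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    return float(math.prod(range(1, m, 2)))


def apply_T(coeffs, rho):
    """Coefficients of the polynomial x -> E[p(rho * x + sqrt(1 - rho^2) * Z)]."""
    s = math.sqrt(1.0 - rho * rho)
    out = [0.0] * len(coeffs)
    for d, c in enumerate(coeffs):
        for j in range(d + 1):
            # Binomial expansion of (rho x + s Z)^d, then average over Z.
            out[j] += c * math.comb(d, j) * rho**j * s ** (d - j) * gaussian_moment(d - j)
    return out
```

For each degree `k`, `apply_T(hermite(k), rho)` should agree with `rho**k` times `hermite(k)` up to floating-point error.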
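Proof of Lemma 4.2. First, consider two arbitrary functions $b(x), c(x) \in
L^2(\mathbb{R}^n, \gamma_n)$ and suppose that their expansions in the Hermite basis are
$b = \sum_\alpha b_\alpha H_\alpha$ and $c = \sum_\alpha c_\alpha H_\alpha$. Then

\[ \mathbb{E}_\rho(b(X) - c(Y))^2 = \mathbb{E}b^2 + \mathbb{E}c^2 - 2\mathbb{E}_\rho b(X)c(Y) \]
\[ = \mathbb{E}b^2 + \mathbb{E}c^2 - 2\mathbb{E}b(X)(T_\rho c)(X) \]
\[ = \sum_\alpha \big(b_\alpha^2 + c_\alpha^2 - 2\rho^{|\alpha|} b_\alpha c_\alpha\big), \]

where we have used (4.2) in the last line to compute the Hermite expansion
of $T_\rho c$. Now, $2 b_\alpha c_\alpha \le b_\alpha^2 + c_\alpha^2$ and so

\[ \mathbb{E}_\rho(b(X) - c(Y))^2 = (b_0 - c_0)^2 + \sum_{|\alpha| \ge 1} \big(b_\alpha^2 + c_\alpha^2 - 2\rho^{|\alpha|} b_\alpha c_\alpha\big) \]
\[ \ge (b_0 - c_0)^2 + \sum_{|\alpha| \ge 1} (b_\alpha^2 + c_\alpha^2)(1 - \rho^{|\alpha|}) \]
\[ \ge (b_0 - c_0)^2 + (1 - \rho) \sum_{|\alpha| \ge 1} (b_\alpha^2 + c_\alpha^2). \tag{4.3} \]

Now write $v$ and $w$ in the Hermite basis as $v = \sum v_\alpha H_\alpha$ and $w = \sum w_\alpha H_\alpha$.
Then, by (4.1),

\[ \frac{\partial v}{\partial x_i} = \sum_{\alpha_i \ge 1} v_\alpha \sqrt{\alpha_i}\, H_{S_i\alpha}, \qquad
   \frac{\partial w}{\partial x_i} = \sum_{\alpha_i \ge 1} w_\alpha \sqrt{\alpha_i}\, H_{S_i\alpha}. \]

In particular, if we set $b = \frac{\partial v}{\partial x_i}$, then $b_{S_i\alpha} = \sqrt{\alpha_i}\, v_\alpha$ for any $\alpha$ with $\alpha_i \ge 1$.
Specifically, $b_0 = v_{e_i}$ (where $e_i$ is the multi-index with 1 in position $i$ and
0 elsewhere) and

\[ \sum_{|\alpha| \ge 1} b_\alpha^2 = \sum_{|\alpha| \ge 2,\, \alpha_i \ge 1} b_{S_i\alpha}^2 = \sum_{|\alpha| \ge 2,\, \alpha_i \ge 1} \alpha_i v_\alpha^2. \]

(Setting $c = \frac{\partial w}{\partial x_i}$, there is of course an analogous identity for $c$ and $w$.)
Applying this to (4.3), we have

\[ \mathbb{E}_\rho\Big(\frac{\partial v}{\partial x_i}(X) - \frac{\partial w}{\partial x_i}(Y)\Big)^2 \ge (v_{e_i} - w_{e_i})^2 + (1 - \rho) \sum_{|\alpha| \ge 2,\, \alpha_i \ge 1} \alpha_i (v_\alpha^2 + w_\alpha^2). \tag{4.4} \]

Now if we apply (4.4) for each $i$ and sum the resulting inequalities, we obtain

\[ \mathbb{E}_\rho|\nabla v(X) - \nabla w(Y)|^2 \ge \sum_{|\alpha| = 1} (v_\alpha - w_\alpha)^2 + 2(1 - \rho) \sum_{|\alpha| \ge 2} (v_\alpha^2 + w_\alpha^2). \tag{4.5} \]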

|a|=l |a|>2 
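On the other hand, let $a = \frac12(\mathbb{E}\nabla v + \mathbb{E}\nabla w)$. Since $\mathbb{E}\frac{\partial v}{\partial x_i} = v_{e_i}$ and $H_{e_i}(x) =
x_i$, it follows that

\[ \langle x, a\rangle = \frac12 \sum_{|\alpha| = 1} (v_\alpha + w_\alpha) H_\alpha(x). \]

Since $\mathbb{E}v = v_0$, we have

\[ \mathbb{E}(v(X) - \langle X, a\rangle - \mathbb{E}v)^2 = \sum_{|\alpha| = 1} \Big(\frac{v_\alpha - w_\alpha}{2}\Big)^2 + \sum_{|\alpha| \ge 2} v_\alpha^2. \]

Adding to this the analogous expression for $w$, we obtain

\[ 2(1 - \rho)\big(\mathbb{E}(v(X) - \langle X, a\rangle - \mathbb{E}v)^2 + \mathbb{E}(w(X) - \langle X, a\rangle - \mathbb{E}w)^2\big)
 = (1 - \rho) \sum_{|\alpha| = 1} (v_\alpha - w_\alpha)^2 + 2(1 - \rho) \sum_{|\alpha| \ge 2} (v_\alpha^2 + w_\alpha^2). \]

Noting that $1 - \rho < 1$, we see that this is at most the right-hand side of (4.5). Hence

\[ \mathbb{E}(v(X) - \langle X, a\rangle - \mathbb{E}v)^2 + \mathbb{E}(w(X) - \langle X, a\rangle - \mathbb{E}w)^2 \le \frac{\mathbb{E}_\rho|\nabla v(X) - \nabla w(Y)|^2}{2(1 - \rho)}. \]

□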



4.2 A lower bound on $dR_t/dt$

Recall the formula for $\frac{dR_t}{dt}$ given in Lemma 2.2. In this section, we will
use the reverse-Hölder inequality to split this formula into an exponential
term and a term depending on $|\nabla v_t(X) - \nabla w_t(Y)|$. We will then use the
smoothness of $v_t$ and $w_t$ to bound the exponential term, with the following
result:

Proposition 4.3. For any $0 < \rho < 1$ and any $t > 0$, there is a $c(t,\rho) > 0$
such that for any $r \le \frac{1}{1 + 4k_t^2/(1-\rho)}$ and for any $f$ and $g$,

> c(t,p)m 2 ^^(E\Vv t (X) - Vw t (Y)\ 2r Y /r . 

There are three main ingredients in the proof of Proposition 4.3. The
first is the reverse-Hölder inequality, which states that for any functions
$f > 0$ and $g > 0$ and for any $\beta > 0$ and $0 < r < 1$ with $\frac1r - \frac1\beta = 1$,

\[ \mathbb{E}fg \ge (\mathbb{E}f^{-\beta})^{-1/\beta} (\mathbb{E}g^{r})^{1/r}. \tag{4.6} \]
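To make the exponent bookkeeping in (4.6) concrete, the following sketch evaluates both sides for an empirical (uniform, finite) distribution; the constraint $\frac1r - \frac1\beta = 1$ is solved as $\beta = r/(1-r)$. This is only a numerical illustration with hypothetical helper names, not a proof.

```python
def reverse_holder_sides(f_vals, g_vals, r):
    """Return (E[fg], (E f^{-beta})^{-1/beta} (E g^r)^{1/r}) under the uniform
    distribution on the given samples, where beta solves 1/r - 1/beta = 1."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    if len(f_vals) != len(g_vals) or not f_vals:
        raise ValueError("need two non-empty samples of equal length")
    beta = r / (1.0 - r)
    n = len(f_vals)
    lhs = sum(f * g for f, g in zip(f_vals, g_vals)) / n
    mean_f = sum(f ** (-beta) for f in f_vals) / n
    mean_g = sum(g ** r for g in g_vals) / n
    rhs = mean_f ** (-1.0 / beta) * mean_g ** (1.0 / r)
    return lhs, rhs
```

For strictly positive inputs the first returned value should never be smaller than the second; for instance, with $r = \frac12$ (so $\beta = 1$) the inequality reduces to Cauchy-Schwarz applied to $\sqrt{fg}$ and $f^{-1/2}$.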

The second ingredient involves concentration properties of the Gaussian
measure. The proof is a standard computation, and we omit it.

Lemma 4.4. If $f : \mathbb{R}^n \to \mathbb{R}$ is 1-Lipschitz with median $M$ then for any $\lambda < 1$,

\[ \mathbb{E}\exp(\lambda f^2(X)/2) \le \frac{2}{\sqrt{1-\lambda}}\, e^{\frac{\lambda}{2(1-\lambda)} M^2}. \]

The third and final ingredient is a relationship between the mean of $f$
and the median of $v_t$.

Lemma 4.5. If $N_t$ is a median of $v_t$ then

\[ m(f) = \mathbb{E}f(1 - \mathbb{E}f) \le 2\exp\Big(-\frac{N_t^2}{2(1 + k_t)^2}\Big). \]



Proof of Lemma 4.5. Lemma 3.8 of [31] proved that if $M_t$ is a median of $f_t$
then

\[ \mathbb{E}f \le 2 M_t^{1/(1+k_t)^2}. \]

Recall that $f_t = \Phi \circ v_t$ and so $M_t = \Phi(N_t)$. Suppose first that $N_t \le 0$. Since
$\Phi(x) \le e^{-x^2/2}$ for $x \le 0$, we see that $M_t \le e^{-N_t^2/2}$ and so

\[ \mathbb{E}f \le 2\exp\Big(-\frac{N_t^2}{2(1 + k_t)^2}\Big). \tag{4.7} \]

On the other hand, if $N_t > 0$, we apply the preceding argument to $1 - f$ and
we conclude that

\[ \mathbb{E}(1 - f) \le 2\exp\Big(-\frac{N_t^2}{2(1 + k_t)^2}\Big). \tag{4.8} \]

Of course, $\max\{\mathbb{E}f, 1 - \mathbb{E}f\} \le 1$ and so we can combine (4.7) and (4.8) to
prove the second claim of the lemma. □



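Proof of Proposition 4.3. We begin by applying the reverse-Hölder inequality
(4.6) to the equation in Lemma 2.2:

\[ \frac{dR_t}{dt} \ge \frac{\rho}{2\pi\sqrt{1-\rho^2}} \Big(\mathbb{E}_\rho \exp\Big(\beta\, \frac{v_t^2 + w_t^2 - 2\rho v_t w_t}{2(1-\rho^2)}\Big)\Big)^{-1/\beta} \big(\mathbb{E}_\rho|\nabla v_t - \nabla w_t|^{2r}\big)^{1/r} \tag{4.9} \]

with $\beta$ and $r$ yet to be determined. Let us first consider the exponential
term in (4.9). Since $2|v_t w_t| \le v_t^2 + w_t^2$, we have

\[ \mathbb{E}_\rho \exp\Big(\beta\, \frac{v_t^2 + w_t^2 - 2\rho v_t w_t}{2(1-\rho^2)}\Big) \le \mathbb{E}_\rho \exp\Big(\beta\, \frac{v_t^2 + w_t^2}{2(1-\rho)}\Big)
 \le \Big(\mathbb{E}\exp\Big(\frac{\beta v_t^2}{1-\rho}\Big)\, \mathbb{E}\exp\Big(\frac{\beta w_t^2}{1-\rho}\Big)\Big)^{1/2}, \tag{4.10} \]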
where we used the Cauchy-Schwarz inequality in the last line. Recall from
Lemma 3.2 that $v_t$ and $w_t$ are both $k_t$-Lipschitz. Thus, we can apply
Lemma 4.4 with $f = v_t/k_t$ and $\lambda = 2\beta k_t^2/(1-\rho)$; we see that if $\lambda = 2\beta k_t^2/(1-\rho) \le \frac12$, then

Eexp(/3^)<Ce AM S 
v 1-p' 

where $M_t$ is a median of $v_t$. Applying the same argument to $w_t$ and plugging
the result into (4.10), we have



E p e Xp ( ^ + ^: 2 ^ )<Ce^^ 2 ), 



,2 

'2(1-P 2 ) 

where $N_t$ is a median of $w_t$. Going back to (4.9), we have

^ > _^ c -K^)(E p |V« t - V^| 2 f\ (4.11) 
at \/l-p 2 v 7 

with (recall) $\lambda = 2\beta k_t^2/(1-\rho) \le \frac12$; hence, $\beta \le \frac14 (1-\rho)/k_t^2$. Recalling that
$\frac1r - \frac1\beta = 1$, we see that (4.11) holds for any $r \le \frac{1}{1 + 4k_t^2/(1-\rho)}$. Finally, we invoke
Lemma 4.5 to show that

exp ( - -M 2 ) = exp ( - > (cE/(l - IE/)) ~ 

(and similarly for $g$ and $N_t$). Plugging this into (4.11) completes the proof.

□ 



4.3 Proof of Proposition 4.1

We are now prepared to prove Proposition 4.1 by combining Proposition 4.3
with Lemmas 3.2 and 4.2. Besides combining these three results, there is a
small technical obstacle: we know only that the integral of $\frac{dR_t}{dt}$ is small; we
don't know anything about $\frac{dR_t}{dt}$ at specific values of $t$. So instead of showing
that $v_t$ is close to linear for every $t$, we will show that for every $t$, there is a
nearby $t^*$ such that $v_{t^*}$ is close to linear. By ensuring that $t^*$ is close to $t$,
we will then be able to argue that $v_t$ is also close to linear.

Proof of Proposition 4.1. For any $0 < r < 1$, Lemma 3.2 implies that
/tp , .2r\l/r ( E I V ^- Vu» t | 2 ) 1/r 
By Lemma 4.2 applied to $v_t$ and $w_t$, if we set $a = \frac12(\mathbb{E}\nabla v_t + \mathbb{E}\nabla w_t)$ and we
define $\epsilon(v_t) = \mathbb{E}(v_t(X) - \langle X, a\rangle - \mathbb{E}v_t)^2$ (and similarly for $\epsilon(w_t)$), then

, 2(l-r)/r 

(e(«t) + < ^ (EpIVt* - V^| 2j -) 1/r . 

1-/3 



Now we plug this into Proposition 4.3 to obtain

dt 

~oo dR. 



(e(v t ) + e(w t )) 1/r < C(t, p)m 2k t^ k ^ — . (4.12) 



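Recall that $\delta(f,g) = \int_0^\infty \frac{dR_s}{ds}\, ds$. In particular,

\[ \alpha t \min_{t \le s \le t(1+\alpha)} \frac{dR_s}{ds} \le \int_t^{t(1+\alpha)} \frac{dR_s}{ds}\, ds \le \delta(f,g) \]

and so there is some $s \in [t, t(1+\alpha)]$ such that $\frac{dR_s}{ds} \le \frac{\delta}{\alpha t}$. If we apply this
to (4.12) with $t$ replaced by $s$ and with $r = \frac{1}{1 + 4k_t^2/(1-\rho)} \le \frac{1}{1 + 4k_s^2/(1-\rho)}$, we
obtain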

e(u s ) + e(u/ s ) <C(t,p)m™%^ >(-)\ 

Since <3? is Lipschitz, if we denote E(/ s - <3?((X, a) -Kv s )) 2 by e(/ s ) (and 
similarly for g s ), then we have 



e(fs)+e(g s )<C(t,p)m r ^^(-) 



Note that r = T -^ ? > and so 

<fs) + e(g s ) < C(t,p)m^^(^) r . (4.13) 

Now we will need a lemma to show that $\epsilon(f_t)$ and $\epsilon(g_t)$ are small. We
will prove the lemma after this proof is complete.

Lemma 4.6. For any $t < s$ and any $h \in L^2(\mathbb{R}^n, \gamma_n)$,

\[ \mathbb{E}(P_t h)^2 \le \big(\mathbb{E}(P_s h)^2\big)^{t/s} \big(\mathbb{E}h^2\big)^{1 - t/s}. \]

To complete the proof of Proposition 4.1, apply Lemma 4.6 with $h =
f - P_s^{-1}\Phi(\langle X, a\rangle - \mathbb{E}v_s)$ (note that $P_s^{-1}\Phi(\langle X, a\rangle - \mathbb{E}v_s)$ exists by Lemma 3.1
because $|a| \le k_s$). Since $\mathbb{E}h^2 \le \sup|h| \le 1$ and $s \le (1 + \alpha)t$, we see that

\[ \epsilon(f_t) = \mathbb{E}(P_t h)^2 \le \big(\mathbb{E}(P_s h)^2\big)^{t/s} \le \epsilon(f_s)^{1/(1+\alpha)}. \]
Applying this (and the equivalent inequality for $g$) to (4.13), we have

e(f t ) + e{g t ) < C(t,p)^m ak ^^ 1+a U-)^, 

where e(ft) means K(ft-P~} t 3>((X, a)-Kv s )) 2 and similarly for e(gt). Since 
a < I - T+q - ^ anc ^ so we can arj sorb the power into the constant 

c(t, P ). a □ 

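Proof of Lemma 4.6. Expand $P_s h$ in the Hermite basis as $P_s h = \sum_\alpha b_\alpha H_\alpha$.
Then

\[ \mathbb{E}(P_t h)^2 = \sum_\alpha b_\alpha^2 e^{2(s-t)|\alpha|}. \]

By Hölder's inequality applied with the exponents $s/t$ and $s/(s - t)$,

\[ \mathbb{E}(P_t h)^2 = \sum_\alpha b_\alpha^{2(s-t)/s} e^{2(s-t)|\alpha|}\, b_\alpha^{2t/s}
 \le \Big(\sum_\alpha b_\alpha^2 e^{2s|\alpha|}\Big)^{(s-t)/s} \Big(\sum_\alpha b_\alpha^2\Big)^{t/s}
 = (\mathbb{E}h^2)^{(s-t)/s} \big(\mathbb{E}(P_s h)^2\big)^{t/s}. \qquad □ \]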
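Lemma 4.6 is a log-convexity statement for $t \mapsto \mathbb{E}(P_t h)^2$, and it can be illustrated numerically using only the Hermite weights of $h$. In the sketch below, `weights[d]` stands for the sum of squared Hermite coefficients of $h$ at degree $d$, so that $\mathbb{E}(P_t h)^2 = \sum_d \texttt{weights}[d]\, e^{-2td}$; the names and the tolerance are assumptions made for this illustration.

```python
import math


def energy(weights, t):
    """E (P_t h)^2, where weights[d] is the total squared Hermite mass of h at degree d."""
    return sum(w * math.exp(-2.0 * t * d) for d, w in enumerate(weights))


def lemma_4_6_holds(weights, t, s, rel_tol=1e-12):
    """Check E(P_t h)^2 <= (E(P_s h)^2)^(t/s) * (E h^2)^(1 - t/s) for 0 <= t < s."""
    if not 0.0 <= t < s:
        raise ValueError("need 0 <= t < s")
    lhs = energy(weights, t)
    rhs = energy(weights, s) ** (t / s) * energy(weights, 0.0) ** (1.0 - t / s)
    return lhs <= rhs * (1.0 + rel_tol)
```

When all the mass sits at a single degree the two sides coincide, which shows that the exponents $t/s$ and $1 - t/s$ cannot be improved in general.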



5 Robustness: time-reversal 

The final step in proving Theorem 1.4 is to show that the conclusion of
Proposition 4.1 implies that $f$ and $g$ are close to one of the equality cases.
In [31], the authors used a spectral argument. However, that spectral
argument was responsible for the logarithmically slow rates (in $\delta$) that [31]
showed. Here, we use a better time-reversal argument that gives polynomial
rates. The argument here will need the function $f$ to take values only in
$\{0, 1\}$. Thus, we will first establish Theorem 1.4 for sets; having done so, it
is not difficult to extend it to functions using the equivalence, described in
Section 1.4, between the set and functional forms of Borell's theorem.

The main goal of a time-reversal argument is to bound $\mathbb{E}|h|$ from above
in terms of $\mathbb{E}|P_t h|$, for some function $h$. The difficulty is that such bounds
are not possible for general $h$. An illuminating example is the function
$h : \mathbb{R} \to \mathbb{R}$ given by $h(x) = \mathrm{sgn}(\sin(kx))$: on the one hand, $\mathbb{E}|h| = 1$; on the
other, $\mathbb{E}|P_t h|$ can be made arbitrarily small by taking $k$ large.
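The example can be quantified with a short computation. Assuming the Mehler representation $(P_t h)(x) = \mathbb{E}\,h(e^{-t}x + \sqrt{1-e^{-2t}}\,Z)$ and the Fourier series $\mathrm{sgn}(\sin(kx)) = \frac{4}{\pi}\sum_{m\ \mathrm{odd}} \frac{\sin(mkx)}{m}$, each mode is damped by the Gaussian characteristic function, which gives $\sup_x |P_t h(x)| \le \frac{4}{\pi}\sum_{m\ \mathrm{odd}} \frac{1}{m} e^{-m^2k^2(1-e^{-2t})/2}$. The sketch below evaluates a truncation of this series (the truncation level `terms` is an arbitrary choice, and the omitted tail is negligible unless $k\sqrt{1-e^{-2t}}$ is very small).

```python
import math


def sup_bound(k, t, terms=200):
    """Truncated series bounding sup_x |P_t h(x)| for h(x) = sgn(sin(k x))."""
    s2 = 1.0 - math.exp(-2.0 * t)  # variance of the noise in Mehler's formula
    total = 0.0
    for j in range(terms):
        m = 2 * j + 1  # only odd frequencies appear in the square wave
        total += math.exp(-((m * k) ** 2) * s2 / 2.0) / m
    return 4.0 / math.pi * total
```

Since $\mathbb{E}|P_t h| \le \sup_x |P_t h(x)|$, the bound decays like $e^{-k^2(1-e^{-2t})/2}$ while $\mathbb{E}|h| = 1$ for every $k$, so no inequality of the form $\mathbb{E}|h| \le C(t)\,\mathbb{E}|P_t h|$ can hold for general $h$.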
The example above is problematic because there is a lot of cancellation
in $P_t h$. The essence of this section is that for the functions $h$ we are
interested in, there is a geometric reason which disallows too much cancellation.
Indeed, we are interested in functions $h$ of the form $1_A - 1_B$ where $B$ is a
half-space. The negative part of such a function is supported on $B$, while
the positive part is supported on $B^c$. As we will see, this fact allows us
to bound the amount of cancellation that occurs, and thus obtain a
time-reversal result:

Proposition 5.1. Let $B$ be a half-space and $A$ be any other set. There is
an absolute constant $C$ such that for any $t > 0$,

\[ \gamma(A \Delta B) \le C \max\Big\{\mathbb{E}|P_t 1_A - P_t 1_B|,\ (e^{2t} - 1)^{1/4} \sqrt{\mathbb{E}|P_t 1_A - P_t 1_B|}\Big\}. \]

The main idea in Proposition 5.1 is in the following lemma, which states
that if a non-negative function is supported on a half-space then $P_t$ will push
strictly less than half of its mass onto the complementary half-space.

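Lemma 5.2. There is a constant $c > 0$ such that for any $b \in \mathbb{R}$, if $f : \mathbb{R}^n \to
[0, 1]$ is supported on $\{x_1 \le b\}$ then for any $t > 0$,

\[ \mathbb{E}(P_t f) 1_{\{X_1 \ge e^{-t} b\}} \le \max\Big\{\frac12 \mathbb{E}f - c\, \frac{(\mathbb{E}f)^2}{\sqrt{e^{2t} - 1}},\ \frac38 \mathbb{E}f\Big\}. \]

Proof. Because $P_t$ is self-adjoint,

\[ \mathbb{E}(P_t f) 1_{\{X_1 \ge e^{-t} b\}} = \mathbb{E} f\, P_t 1_{\{X_1 \ge e^{-t} b\}} = \mathbb{E} f(X)\, \Phi\Big(\frac{X_1 - b}{\sqrt{e^{2t} - 1}}\Big). \]

Now, the set $\{b - \mathbb{E}f \le x_1 \le b\}$ has measure at most $\phi(0)\mathbb{E}f$. In particular,
$\mathbb{E} f 1_{\{b - \mathbb{E}f \le X_1 \le b\}} \le \phi(0)\mathbb{E}f \le \frac12 \mathbb{E}f$.

Let $A = \{x_1 \le b - \mathbb{E}f\}$ and $B = \{b - \mathbb{E}f \le x_1 \le b\}$ and recall that $f$ is
supported on $\{x_1 \le b\}$, so that $f = f(1_A + 1_B)$. Now,

\[ \Phi\Big(\frac{x_1 - b}{\sqrt{e^{2t} - 1}}\Big) \le \begin{cases} \Phi\Big(-\frac{\mathbb{E}f}{\sqrt{e^{2t} - 1}}\Big) & x \in A \\ \frac12 & x \in B \end{cases} \]

and so

\[ \mathbb{E} f(X)\, \Phi\Big(\frac{X_1 - b}{\sqrt{e^{2t} - 1}}\Big) \le \frac12 \mathbb{E} f 1_B + \Phi\Big(-\frac{\mathbb{E}f}{\sqrt{e^{2t} - 1}}\Big) \mathbb{E} f 1_A. \]

There is a constant $c > 0$ such that for all $x \ge 0$, $\Phi(-x) \le \max\{\frac12 - cx, \frac14\}$.
Applying this with $x = \frac{\mathbb{E}f}{\sqrt{e^{2t} - 1}}$, we have

\[ \mathbb{E} f(X)\, \Phi\Big(\frac{X_1 - b}{\sqrt{e^{2t} - 1}}\Big) \le \frac12 \mathbb{E}f - \mathbb{E}f 1_A \min\Big\{c\, \frac{\mathbb{E}f}{\sqrt{e^{2t} - 1}},\ \frac14\Big\} \le \max\Big\{\frac12 \mathbb{E}f - c\, \frac{(\mathbb{E}f)^2}{\sqrt{e^{2t} - 1}},\ \frac38 \mathbb{E}f\Big\}, \]

where in the last inequality, we recalled that $\mathbb{E}f 1_A \ge \frac12 \mathbb{E}f$ (and adjusted the constant $c$). □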

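Proof of Proposition 5.1. Without loss of generality, $B$ is the half-space
$\{x_1 \le b\}$. Let $f$ be the positive part of $1_A - 1_B$ and let $g$ be the
negative part, so that $\gamma(A \Delta B) = \mathbb{E}f + \mathbb{E}g$. Note that $f$ is supported on $B^c$ and
$g$ is supported on $B$.

Without loss of generality, $\mathbb{E}f \ge \mathbb{E}g$; Lemma 5.2 implies that if $\mathbb{E}f \le
C\sqrt{e^{2t} - 1}$ then

\[ 2\mathbb{E}(1_B P_t f + 1_{B^c} P_t g) \le \mathbb{E}f + \mathbb{E}g - c\, \frac{(\mathbb{E}f + \mathbb{E}g)^2}{\sqrt{e^{2t} - 1}}. \tag{5.2} \]

On the other hand, if $\mathbb{E}f > C\sqrt{e^{2t} - 1}$ then

\[ 2\mathbb{E}(1_B P_t f + 1_{B^c} P_t g) \le \frac34 \mathbb{E}f + \mathbb{E}g \le \frac78 (\mathbb{E}f + \mathbb{E}g). \tag{5.3} \]

Thus,

\[ \mathbb{E}|P_t f - P_t g| = \mathbb{E}P_t f + \mathbb{E}P_t g - 2\mathbb{E}\min\{P_t f, P_t g\} \]
\[ = \mathbb{E}f + \mathbb{E}g - 2\mathbb{E}\min\{P_t f, P_t g\} \]
\[ \ge \mathbb{E}f + \mathbb{E}g - 2\mathbb{E}(1_B P_t f + 1_{B^c} P_t g) \]
\[ \ge \min\Big\{c\, \frac{(\mathbb{E}f + \mathbb{E}g)^2}{\sqrt{e^{2t} - 1}},\ \frac{\mathbb{E}f + \mathbb{E}g}{8}\Big\}, \]

where we have applied (5.2) and (5.3) in the last inequality. Now there are
two cases, depending on which term in the minimum is smaller: if the first
term is smaller then

\[ \mathbb{E}f + \mathbb{E}g \le C (e^{2t} - 1)^{1/4} \sqrt{\mathbb{E}|P_t f - P_t g|}; \]

otherwise, the second term in the minimum is smaller and

\[ \mathbb{E}f + \mathbb{E}g \le 8\, \mathbb{E}|P_t f - P_t g|. \]

In either case,

\[ \gamma(A \Delta B) = \mathbb{E}f + \mathbb{E}g \le C \max\Big\{\mathbb{E}|P_t f - P_t g|,\ (e^{2t} - 1)^{1/4} \sqrt{\mathbb{E}|P_t f - P_t g|}\Big\}, \]

as claimed. □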
5.1 Synchronizing the time-reversal

Proposition 5.1 would be enough if we knew that $\mathbb{E}(P_t 1_A - P_t 1_B)^2$ were
small. Now, Proposition 4.1 and Lemma 3.1 imply that $\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2$
is small, for some $s > 0$. In this section, we will show that if $e^{-t} = \rho$ then $s$
must be small. Now, this is not necessarily the case for arbitrary sets $A$; in
fact, for any $s > 0$ one can find $A$ such that $\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2$ is arbitrarily
small. Fortunately, we have some extra information on $A$: we know that it
is almost optimally noise stable with parameter $\rho$. In particular, if $e^{-t} = \rho$
then $\mathbb{E} 1_A P_t 1_A$ is close to $\mathbb{E} 1_B P_t 1_B$.

Using this extra information, the proof of robustness proceeds as follows:
since $\mathbb{E} 1_A P_t 1_A$ is close to $\mathbb{E} 1_B P_t 1_B$ and $P_t 1_A$ is close to $P_{t+s} 1_B$, we will
show that $\mathbb{E} 1_B P_{t+s} 1_B$ is close to $\mathbb{E} 1_B P_t 1_B$. But we know all about $B$: it
is a half-space. Therefore, we can find explicit and accurate estimates for
$\mathbb{E} 1_B P_{t+s} 1_B$ and $\mathbb{E} 1_B P_t 1_B$ in terms of $t$, $s$ and $\gamma_n(B)$; using them, we can
conclude that $s$ is small. Now, if $s$ is small then we can show (again, using
explicit estimates) that $\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2$ is small. Since $\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2$
is small (this was our starting point, remember), we can apply the triangle
inequality to conclude that $\mathbb{E}(P_t 1_A - P_t 1_B)^2$ is small. Finally, we can apply
Proposition 5.1 to show that $\mathbb{E}|1_A - 1_B|$ is small.

Proposition 5.3. For every $t$, there is a $C(t)$ such that the following holds.
For sets $A, A' \subset \mathbb{R}^n$, suppose that $B, B' \subset \mathbb{R}^n$ are parallel half-spaces with
$\gamma(A) = \gamma(B)$, $\gamma(A') = \gamma(B')$. If there exist $s, \epsilon_1, \epsilon_2 > 0$ such that

\[ \mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2 \le \epsilon_1^2 \]

and

\[ \mathbb{E} 1_A P_t 1_{A'} \ge \mathbb{E} 1_B P_t 1_{B'} - \epsilon_2 \]

then

\[ \big(\mathbb{E}(P_t 1_A - P_t 1_B)^2\big)^{1/2} \le C(t)\, \frac{\epsilon_1 + \epsilon_2}{\big(I(\gamma(A)) I(\gamma(A'))\big)^{C(t)}}, \]

where $I(x) = \phi(\Phi^{-1}(x))$.

Rather than prove Proposition 5.3 all at once, we have split the part
relating $\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2$ and $\mathbb{E} 1_B (P_t 1_{B'} - P_{t+s} 1_{B'})$ into a separate lemma.

Lemma 5.4. For every $t$ there is a $C(t)$ such that for any parallel
half-spaces $B$ and $B'$, and for every $s > 0$,

\[ \big(\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2\big)^{1/2} \le C(t)\, \frac{\mathbb{E} 1_B (P_t 1_{B'} - P_{t+s} 1_{B'})}{\big(I(\gamma(B)) I(\gamma(B'))\big)^{C(t)}}. \]
Proof. First of all, one can easily check through integration by parts that
for a smooth function $f : \mathbb{R} \to \mathbb{R}$,

\[ \int_b^\infty \phi(x) (Lf)(x)\, dx = -f'(b)\phi(b). \tag{5.4} \]



By rotating B and B' , we can assume that B = {x\ < a} and B' = {x\ < b}. 
Let F a 
by dS2 



Let F ab (t) = ~E1 BPt'i-B' = /a°° < ^( x ) ( ^ ) (^7f=5?) ^ x ano - consider its derivative: 



foo / p ^ x — h \ 

KM -I «x) M (- 7 =) (fa 
- -«^(£=) 

fct / a 2 + b 2 - 2e~ t ab \ 
= __ exp ^ _________ j 

k t I a 2 + b 2 \ 

_-_=*)■ 

Now, $k_t$ is decreasing in $t$ and $\exp(-x/(1 - e^{-2t}))$ is increasing in $t$ (for $x \ge 0$). In
particular, for any $r \in [t, t + s]$,

Hence, 

F ab (t) - F ab (t + s) > -smaxF^r) > ^ exp ( - ^rj)- (5.5) 

If $s$ is large, this is a poor bound because $s k_{t+s}$ decreases exponentially in
$s$. However, when $s \ge 1$ we can instead use

F ab (t) - F ab (t + s) > F ab (t) - F ab (t + 1) > ^± exp ( - ^^)- (5-6) 

Equations (5.5) and (5.6) show that if $\mathbb{E} 1_B (P_t 1_{B'} - P_{t+s} 1_{B'})$ is small then
$s$ must be small. The next step, therefore, is to control $\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2$
in terms of $s$. Now,

\[ \mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2 = \mathbb{E}\big((P_t 1_B)^2 + (P_{t+s} 1_B)^2 - 2(P_t 1_B)(P_{t+s} 1_B)\big) \]
\[ = \mathbb{E} 1_B \big(P_{2t} 1_B + P_{2(t+s)} 1_B - 2 P_{2t+s} 1_B\big) \]
\[ = \big(F_{aa}(2t) - F_{aa}(2t + s)\big) - \big(F_{aa}(2t + s) - F_{aa}(2t + 2s)\big) \]
\[ \le s\big(F'_{aa}(2t) - F'_{aa}(2t + 2s)\big), \tag{5.7} \]
where the inequality follows because

*M--£M-^)--|«p(-I7p) 

and so $F'_{aa}$ is an increasing function. To control the right hand side of (5.7),
we go to the second derivative of $F$:

F " {t) = 2vr(e 2i - 1)3/2 ex P ( " IT"; ) + 27r y^rTi (1 + e-*) 2 6XP ( " T + e 1 *) 

This is decreasing in $t$; hence

\[ \mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2 \le s\big(F'_{aa}(2t) - F'_{aa}(2t + 2s)\big) \le 2s^2 F''_{aa}(2t). \tag{5.8} \]

We will now complete the proof by combining our upper bound on
$\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2$ with our lower bounds on $\mathbb{E} 1_B (P_t 1_{B'} - P_{t+s} 1_{B'})$. First,
assume that s < 1. Then > and so (|5.5p plus (|5.8p implies that 



(E(P t l B - P i+ slB) 2 ) 1/2 < 27rexp(^^-) ^ 2 f //(2t) El B (P t l^ - Pus^B') 



1 2 



2-7T l-e- 



yj2F"(2t) m B {P t l B , - P f+ ,1 B Q 
(J( 7 (P))I( 7 (P')))^ 



If we take $C(t) \ge \max\{\sqrt{2F''(2t)}/k_{t+1},\ 2/(1 - e^{-2t})\}$ then the Lemma holds
in this case. On the other hand, if $s > 1$ then (5.6) implies that

2^-1^ El B (P t l B ,-P + ,l B ,) >t 

(l( 7 (P))/( 7 (P')))^ " ' 

Since $\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2 \le 1$ trivially, the Lemma holds in this case provided
that $C(t) \ge \max\{1/k_{t+1},\ 2/(1 - e^{-2t})\}$. □

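Proof of Proposition 5.3. By the Cauchy-Schwarz inequality,

\[ \mathbb{E} 1_A P_t 1_A \le \mathbb{E} 1_A P_{t+s} 1_B + \sqrt{\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2} \le \mathbb{E} 1_A P_{t+s} 1_B + \epsilon_1. \]

Moreover, $\mathbb{E} 1_A P_{t+s} 1_B \le \mathbb{E} 1_B P_{t+s} 1_B$ since $B$ is a super-level set of $P_{t+s} 1_B$
with the same volume as $A$. Thus,

\[ \mathbb{E} 1_B P_t 1_B - \epsilon_2 \le \mathbb{E} 1_A P_t 1_A \le \mathbb{E} 1_A P_{t+s} 1_B + \epsilon_1 \le \mathbb{E} 1_B P_{t+s} 1_B + \epsilon_1. \]

By Lemma 5.4,

\[ \big(\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2\big)^{1/2} \le C(t)\, \mathbb{E} 1_B (P_t 1_B - P_{t+s} 1_B) \le C(t)(\epsilon_1 + \epsilon_2). \]

Finally, the triangle inequality gives

\[ \big(\mathbb{E}(P_t 1_A - P_t 1_B)^2\big)^{1/2} \le \big(\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2\big)^{1/2} + \big(\mathbb{E}(P_t 1_B - P_{t+s} 1_B)^2\big)^{1/2} \le \epsilon_1 + C(t)(\epsilon_1 + \epsilon_2). \]

Of course, the additional $\epsilon_1$ term can be absorbed into the constant $C(t)$. □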



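5.2 Proof of robustness

Proof of Theorem 1.4. First, define $t$ by $e^{-t} = \rho$. We then have $k_t = \frac{\rho}{\sqrt{1-\rho^2}}$
and so the exponent of $\delta$ in Proposition 4.1 becomes

\[ \frac{1}{1 + \frac{4\rho^2}{(1-\rho^2)(1-\rho)}} \cdot \frac{1}{1+\alpha} = \frac{(1-\rho^2)(1-\rho)}{1 - \rho + 3\rho^2 + \rho^3} \cdot \frac{1}{1+\alpha}. \tag{5.9} \]

Of course, we can define $\alpha$ (depending on $\epsilon$ and $\rho$) so that (5.9) is

\[ \eta = \frac{(1-\rho^2)(1-\rho)}{1 - \rho + 3\rho^2 + \rho^3} - \epsilon. \]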
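The algebra behind (5.9) is easy to get wrong, so here is a small numerical check of the $\alpha$-free factor: with $e^{-t} = \rho$ one has $k_t^2 = \rho^2/(1-\rho^2)$, and $\frac{1}{1 + 4k_t^2/(1-\rho)}$ should simplify to $\frac{(1-\rho^2)(1-\rho)}{1-\rho+3\rho^2+\rho^3}$. The function names are illustrative.

```python
import math


def k_squared(t):
    """k_t^2 = 1 / (e^{2t} - 1)."""
    return 1.0 / (math.exp(2.0 * t) - 1.0)


def exponent_via_k(rho):
    """1 / (1 + 4 k_t^2 / (1 - rho)) evaluated at t = log(1 / rho)."""
    t = -math.log(rho)
    return 1.0 / (1.0 + 4.0 * k_squared(t) / (1.0 - rho))


def exponent_closed_form(rho):
    """(1 - rho^2)(1 - rho) / (1 - rho + 3 rho^2 + rho^3)."""
    return (1.0 - rho**2) * (1.0 - rho) / (1.0 - rho + 3.0 * rho**2 + rho**3)
```

The two expressions agree because $(1-\rho^2)(1-\rho) + 4\rho^2 = 1 - \rho + 3\rho^2 + \rho^3$.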

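Now suppose that $f = 1_A$ and $g = 1_{A'}$ for some $A, A' \subset \mathbb{R}^n$.
Proposition 4.1 implies that there are $a \in \mathbb{R}^n$ and $b \in \mathbb{R}$ such that $|a| \le k_t$ and

\[ \mathbb{E}\big((P_t 1_A)(X) - \Phi(\langle a, X\rangle - b)\big)^2 \le C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta}. \]

Since $|a| \le k_t$, Lemma 3.1 implies that we can find some $s \ge 0$ and a half-space
$B$ such that $\Phi(\langle a, x\rangle - b) = (P_{t+s} 1_B)(x)$; then

\[ \mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2 \le C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta}. \tag{5.10} \]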

At this point, it isn't clear that $\gamma(A) = \gamma(B)$; however, we can ensure this
by modifying $B$ slightly:

\[ \mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2 \ge (\mathbb{E}P_t 1_A - \mathbb{E}P_{t+s} 1_B)^2 = (\gamma(A) - \gamma(B))^2. \]

Therefore let $\tilde B$ be a translation of $B$ so that $\gamma(\tilde B) = \gamma(A)$. By the triangle
inequality,

\[ \big(\mathbb{E}(P_t 1_A - P_{t+s} 1_{\tilde B})^2\big)^{1/2} \le \big(\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2\big)^{1/2} + \big(\mathbb{E}(P_{t+s} 1_B - P_{t+s} 1_{\tilde B})^2\big)^{1/2} \]
\[ \le \big(\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2\big)^{1/2} + |\gamma(B) - \gamma(\tilde B)|^{1/2} \]
\[ \le 2\big(\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2\big)^{1/2}. \]

By replacing $B$ with $\tilde B$, we can assume in (5.10) that $\gamma(A) = \gamma(B)$ (at the
cost of increasing $C(\rho,\epsilon)$ by a factor of 2).

Now we apply Proposition 5.3 with $\epsilon_1 = C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta/2}$ and $\epsilon_2 = \delta$. The
conclusion of Proposition 5.3 leaves us with

\[ \big(\mathbb{E}(P_t 1_A - P_t 1_B)^2\big)^{1/2} \le C(\rho,\epsilon)\, m^{c(\rho)} (\epsilon_1 + \epsilon_2) \le C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta/2}, \]

where we have absorbed the constant $C(t)$ from Proposition 5.3 into $C(\rho,\epsilon)$
and $c(\rho)$. Since $\mathbb{E}|X| \le (\mathbb{E}X^2)^{1/2}$ for any random variable $X$, we may apply
Proposition 5.1:

\[ \gamma(A \Delta B) \le C(\rho) \sqrt{\mathbb{E}|P_t 1_A - P_t 1_B|} \le C(\rho) \big(\mathbb{E}(P_t 1_A - P_t 1_B)^2\big)^{1/4} \le C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta/4}. \]

By applying the same argument to $A'$ and $B'$, this establishes Theorem 1.4
in the case that $f$ and $g$ are indicator functions.

To extend the result to other functions, note that $\mathbb{E}J(f(X), g(Y)) =
\mathbb{E}J(1_A(\tilde X), 1_{A'}(\tilde Y))$ where $\tilde X$ and $\tilde Y$ are $\rho$-correlated Gaussian vectors in
$\mathbb{R}^{n+1}$, and

\[ A = \{(x, x_{n+1}) \in \mathbb{R}^{n+1} : x_{n+1} \ge \Phi^{-1}(f(x))\}, \qquad A' = \{(x, x_{n+1}) \in \mathbb{R}^{n+1} : x_{n+1} \ge \Phi^{-1}(g(x))\}. \]

Moreover, $\mathbb{E}f = \gamma_{n+1}(A)$ and $\mathbb{E}g = \gamma_{n+1}(A')$. Applying Theorem 1.4 for
indicator functions in dimension $n + 1$, we find a half-space $B$ so that

\[ \gamma_{n+1}(A \Delta B) \le C(\rho,\epsilon)\, m^{c(\rho)} \delta^{\eta/4}. \tag{5.11} \]

By slightly perturbing $B$, we can assume that it does not take the form
$\{x_i \ge b\}$ for any $1 \le i \le n$; in particular, this means that we can write $B$ in
the form

\[ B = \{(x, x_{n+1}) \in \mathbb{R}^{n+1} : x_{n+1} \ge \langle a, x\rangle - b\} \]

for some $a \in \mathbb{R}^n$ and $b \in \mathbb{R}$. But then

\[ \gamma_{n+1}(A \Delta B) = \mathbb{E}|f(X) - \Phi(\langle a, X\rangle - b)|; \]

combined with (5.11), this completes the proof. □
6 Optimal dependence on $\rho$

In this section, we will prove Theorem 1.5. To do so we need to improve the
dependence on $\rho$ that appeared in Theorem 1.4. Before we begin, let us list
the places where the dependence on $\rho$ can be improved:

1. In Proposition 4.3, we needed to control

\[ \mathbb{E}_\rho \exp\Big(\beta\, \frac{v_t^2(X) + w_t^2(Y) - 2\rho v_t(X) w_t(Y)}{2(1 - \rho^2)}\Big). \]

Of course, the denominator of the exponent vanishes as $\rho \to 1$, so the
exponent can blow up. However, if $v_t = w_t$ then the numerator goes to zero
(in law, at least) at the same rate. In this case, therefore, we are able to
bound the above expectation by an expression not depending on $\rho$.

2. In the proof of Proposition 4.1, we used an $L^\infty$ bound on $|\nabla v_t|$ and
$|\nabla w_t|$ to show that for some $r < 1$,

\[ \big(\mathbb{E}_\rho|\nabla v_t(X) - \nabla w_t(Y)|^2\big)^{1/r} \le C(t) \big(\mathbb{E}_\rho|\nabla v_t(X) - \nabla w_t(Y)|^{2r}\big)^{1/r}. \]

This inequality is not sharp in its $\rho$-dependence because when $v_t = w_t$,
the left hand side shrinks like $(1 - \rho)^{1/r}$ as $\rho \to 1$, while the right hand
side shrinks like $1 - \rho$. We can get the right $\rho$-dependence by using
an $L^p$ bound on $|\nabla v_t(X) - \nabla v_t(Y)|$ when applying Hölder's inequality,
instead of an $L^\infty$ bound.

3. In applying Proposition 5.3, we were forced to take $e^{-t} = \rho$. Since
most of our bounds have a (necessary) dependence on $t$, this causes a
dependence on $\rho$ which is not optimal. To get around this, we will use
the subadditivity property of Kane [20], Kindler and O'Donnell [25]
to show that we can actually choose certain values of $t$ such that $e^{-t}$
is much smaller than $\rho$. In particular, we can take $t$ to be quite large
even when $\rho$ is close to 1.

Once we have incorporated the first two improvements, we will obtain a
better version of Proposition 4.1.

Proposition 6.1. For any $\alpha, t > 0$, there is a constant $C(t,\alpha)$ such that for
any $f : \mathbb{R}^n \to [0, 1]$, there exist $a \in \mathbb{R}^n$, $b \in \mathbb{R}$ with $|a| \le k_t$ such that

E(f t (X) - *((X,a) - b)f < C(t,a)m^~ a (—^==) T ^~°'. 
where $k_t = (e^{2t} - 1)^{-1/2}$, $\delta(f) = J(\mathbb{E}f, \mathbb{E}f) - \mathbb{E}_\rho J(f(X), f(Y))$, and $m(f) =
\mathbb{E}f(1 - \mathbb{E}f)$.

Moreover, this statement holds with a $C(t,\alpha)$ which, for any fixed $\alpha$, is
decreasing in $t$.

Once we have incorporated the third improvement above, we will use the
arguments of Section 5 to prove Theorem 1.5.

6.1 A better bound on the auxiliary term

First, we will tackle item 1 above. Our improved bound leads to a version
of Proposition 4.3 with the correct dependence on $\rho$.

Proposition 6.2. Let $k_t = (e^{2t} - 1)^{-1/2}$. There are constants $0 < c, C < \infty$
such that for any $t > 0$, if $r \le \frac{1}{1 + 8k_t^2}$ then

^ > -^(cm(f))^y 2 (nw t (X) - W t (Y)\ 2 f r 
dt yjl-p 2 V ' 

where $m(f) = \mathbb{E}f(1 - \mathbb{E}f)$.

To obtain this improvement, we note that for a Lipschitz function $v$,
$(v(X) - v(Y))/\sqrt{1 - \rho}$ satisfies a Gaussian tail bound that does not depend
on $\rho$:

Lemma 6.3. If $v : \mathbb{R}^n \to \mathbb{R}$ is $L$-Lipschitz then

\[ \Pr\nolimits_\rho\big(v(X) - v(Y) \ge L s \sqrt{2(1 - \rho)}\big) \le 1 - \Phi(s). \]

In particular, if $4\beta L^2 < 1$ then



' y/1 - 4/3L 2 

Proof. Let $Z_1 = \frac{X + Y}{2}$ and $Z_2 = \frac{X - Y}{2}$, so that $\mathbb{E}Z_1^2 = \frac{1+\rho}{2}$ and $\mathbb{E}Z_2^2 = \frac{1-\rho}{2}$. Now
we condition on $Z_1$: the function $v(Z_1 + Z_2) - v(Z_1 - Z_2)$ is $2L$-Lipschitz in
$Z_2$ and has conditional median zero (because it is odd in $Z_2$); thus

\[ \Pr\nolimits_\rho\big(v(Z_1 + Z_2) - v(Z_1 - Z_2) \ge L s \sqrt{2(1 - \rho)} \,\big|\, Z_1\big) \le 1 - \Phi(s). \]

Now integrate out $Z_1$ to prove the first claim.

Proving the second claim from the first one is a standard calculation. □
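The first claim of Lemma 6.3 can be probed by simulation in one dimension. The sketch below is only an illustration with made-up helper names: it samples $\rho$-correlated pairs via $Y = \rho X + \sqrt{1-\rho^2}\,Z$ and estimates the tail probability for a given Lipschitz function. For the identity function ($L = 1$) the bound is attained, since $X - Y \sim N(0, 2(1-\rho))$.

```python
import math
import random


def Phi(x):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def tail_estimate(v, rho, lipschitz, s, n_samples=100_000, seed=1):
    """Monte Carlo estimate of Pr(v(X) - v(Y) > L s sqrt(2 (1 - rho)))
    for rho-correlated standard Gaussians X, Y on the real line."""
    rng = random.Random(seed)
    threshold = lipschitz * s * math.sqrt(2.0 * (1.0 - rho))
    c = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + c * rng.gauss(0.0, 1.0)
        if v(x) - v(y) > threshold:
            hits += 1
    return hits / n_samples
```

Comparing `tail_estimate(abs, rho, 1.0, s)` with `1 - Phi(s)` for several values of `rho` illustrates that the bound does not degrade as $\rho \to 1$, which is the point of the lemma.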
Next, we use the estimate of Lemma 6.3 to prove a bound on

\[ \mathbb{E}_\rho \exp\Big(\beta\, \frac{v_t^2(X) + v_t^2(Y) - 2\rho v_t(X) v_t(Y)}{2(1 - \rho^2)}\Big) \]

that is better than the one from (4.10) which was used to derive
Proposition 4.3.



(3 > wt/i 6/3fc t 2 < 1, 



Ep exp (^MijghM) < Ce «w. 



Lemma 6.4. There is a constant C such that for any t > 0, and for any 

2 

t 

2(1 -P 2 ) 

where M t is a median of vt ■ 

Proof. We begin with the Cauchy-Schwarz inequality: 
p6P ^ 2(1 -p2) i 

^M^^^^fH^f- (6l) 

Now, recall from Lemma f3.2l that u 4 is /c t -Lipschitz. In particular, Lemma [6. 3 1 
implies that if 8/3k 2 < 1 then the first term of (|6.ip is at most %/2. Finally, 
Lemma [4. 41 implies that the second term of (|6,ip is bounded by Ce Mt I 2 . □ 

Proof of Proposition 6.2. First, follow the proof of Proposition 4.3 up until
(4.9). At this point, we can apply Lemma 6.4 to obtain

dR t >c . P -M?I2{^ ,„.. , v s ,w2r\ 1/r 



:e M ?/ 2 (E p \Vv t (X)-W t (Y)\ 2r ) 



dt ~ yiy 

and we conclude by applying Lemma 4.5, which implies that

\[ e^{-M_t^2/2} \ge (c\, m(f))^{(1 + k_t)^2}. \qquad □ \]
6.2 Higher moments of $|\nabla v_t(X) - \nabla v_t(Y)|$

Here, we will carry out the second step of the plan outlined at the beginning
of Section 6. The main result is an upper bound on arbitrary moments of
$|\nabla v_t(X) - \nabla v_t(Y)|$.

Proposition 6.5. There is a constant $C$ such that for any $t > 0$ and any
$1 \le q < \infty$,

\[ \big(\mathbb{E}_\rho|\nabla v_t(X) - \nabla v_t(Y)|^q\big)^{1/q} \le C k_t^2 \sqrt{q(1 - \rho)}\, \Big((1 + k_t)\sqrt{\log(1/m(f))} + \sqrt{q}\, k_t\Big). \]

If we fix $q$ and $t$, then the bound of Proposition 6.5 has the right
dependence on $\rho$. In particular, we will use it instead of the uniform bound
$|\nabla v_t| \le k_t$, which does not improve as $\rho \to 1$.

There are two main tools in the proof of Proposition 6.5. The first is
a moment bound on the Hessian of $v_t$, which was proved in [31]. In what
follows, $\|\cdot\|_F$ denotes the Frobenius norm of a matrix.

Proposition 6.6. Let $Hv_t$ denote the Hessian matrix of $v_t$. There is a
constant $C$ such that for all $t > 0$ and all $1 \le q < \infty$,

\[ \big(\mathbb{E}\|Hv_t\|_F^q\big)^{1/q} \le C k_t^2 \Big((1 + k_t)\sqrt{\log(1/m(f))} + \sqrt{q}\, k_t\Big). \]

The other tool in the proof of Proposition 6.5 is a result of Pinelis [34],
which will allow us to relate moments of $|\nabla v_t(X) - \nabla v_t(Y)|$ to moments of
$\|Hv_t\|_F$.

Proposition 6.7. Let $h : \mathbb{R}^n \to \mathbb{R}^k$ be a $C^1$ function and let $Dh$ be the $n \times k$
matrix of its partial derivatives. If $Z_1$ and $Z_2$ are independent, standard
Gaussian vectors in $\mathbb{R}^n$ then

\[ \big(\mathbb{E}|h(Z_1) - h(Z_2)|^q\big)^{1/q} \le C\sqrt{q}\, \big(\mathbb{E}\|Dh\|_F^q\big)^{1/q} \]

for every $1 \le q < \infty$, where $C$ is a universal constant.

Proof. Define $f : \mathbb{R}^{2n} \to \mathbb{R}^k$ by $f(Z) = h(Z_1) - h(Z_2)$ where $Z = (Z_1, Z_2)$.
Pinelis [34] showed that if $\Psi : \mathbb{R}^k \to \mathbb{R}$ is a convex function then for any
function $f : \mathbb{R}^{2n} \to \mathbb{R}^k$ with $\mathbb{E}f = 0$,

\[ \mathbb{E}\Psi(f(Z)) \le \mathbb{E}\Psi\Big(\frac{\pi}{2} Df(Z) \cdot \tilde Z\Big), \]

where $\tilde Z$ is an independent copy of $Z$. Applying this with $\Psi(x) = |x|^q$, and
noting that $Df = \binom{1}{-1} \otimes Dh$, we obtain

\[ \mathbb{E}|f(Z)|^q \le C^q\, \mathbb{E}|Dh(Z_1) \cdot Z_2|^q. \]

Now, $\mathbb{E}|A Z_2|^q \le (C\sqrt{q})^q \|A\|_F^q$ for any fixed matrix $A$; if we apply this fact
conditionally on $Z_1$, then we obtain

\[ \mathbb{E}|f(Z)|^q \le (C\sqrt{q})^q\, \mathbb{E}\|Dh\|_F^q. \qquad □ \]

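Proof of Proposition 6.5. Let $Z$, $Z_1$ and $Z_2$ be independent standard
Gaussians on $\mathbb{R}^n$; set $X = \sqrt{\rho}\,Z + \sqrt{1-\rho}\,Z_1$ and
$Y = \sqrt{\rho}\,Z + \sqrt{1-\rho}\,Z_2$ so that $X$ and $Y$ are standard Gaussians with
correlation $\rho$. Conditioned on $Z$, define the function

$$h(x) = \nabla v_t\left(\sqrt{\rho}\,Z + \sqrt{1-\rho}\,x\right),$$

so that $h(Z_1) = \nabla v_t(X)$ and $h(Z_2) = \nabla v_t(Y)$. Note that

$$(Dh)(x) = \sqrt{1-\rho}\,(Hv_t)\left(\sqrt{\rho}\,Z + \sqrt{1-\rho}\,x\right);$$

thus Proposition 6.7 (conditioned on $Z$) implies that

$$\mathbb{E}\left(|\nabla v_t(X) - \nabla v_t(Y)|^q \mid Z\right) \le \left(C\sqrt{q(1-\rho)}\right)^q\, \mathbb{E}\left(\|Hv_t(X)\|_F^q \mid Z\right).$$

Integrating out $Z$ and raising both sides to the power $1/q$, we have

$$\left(\mathbb{E}|\nabla v_t(X) - \nabla v_t(Y)|^q\right)^{1/q} \le C\sqrt{q(1-\rho)}\left(\mathbb{E}\|Hv_t\|_F^q\right)^{1/q}.$$

We conclude by applying Proposition 6.6 to the right hand side. □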
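The coupling used in this proof can be checked numerically. The following is a minimal one-dimensional sketch, assuming only the construction $X = \sqrt{\rho}Z + \sqrt{1-\rho}Z_1$, $Y = \sqrt{\rho}Z + \sqrt{1-\rho}Z_2$ from the proof; the function names `correlated_pair` and `empirical_moments` are hypothetical and chosen for illustration only.

```python
import math
import random


def correlated_pair(rho, rng):
    """Return (x, y): two standard Gaussians with correlation rho in [0, 1].

    z is the shared component; z1 and z2 are the independent parts,
    mirroring X = sqrt(rho) Z + sqrt(1 - rho) Z_1 and the analogous Y.
    """
    z = rng.gauss(0.0, 1.0)
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = math.sqrt(rho) * z + math.sqrt(1.0 - rho) * z1
    y = math.sqrt(rho) * z + math.sqrt(1.0 - rho) * z2
    return x, y


def empirical_moments(rho, samples=200_000, seed=0):
    """Monte Carlo estimates of E[X^2], E[Y^2] and E[XY]."""
    rng = random.Random(seed)
    sx2 = sy2 = sxy = 0.0
    for _ in range(samples):
        x, y = correlated_pair(rho, rng)
        sx2 += x * x
        sy2 += y * y
        sxy += x * y
    return sx2 / samples, sy2 / samples, sxy / samples
```

Conditioning on the shared part $Z$ leaves only the independent parts $Z_1$ and $Z_2$ random, which is exactly the situation in which Proposition 6.7 applies.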

With the first two steps of our outline complete, we are ready to prove
Proposition 6.1. This proof is much like the proof of Proposition 4.1, except
that it uses Propositions 6.2 and 6.5 in the appropriate places.

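Proof of Proposition 6.1. For any non-negative random variable $Z$ and any
$0 < \gamma < 2$, $0 < r < 1$, Hölder's inequality applied with $p = 2r/\gamma$ implies that

$$\mathbb{E}Z^2 = \mathbb{E}Z^\gamma Z^{2-\gamma} \le \left(\mathbb{E}Z^{2r}\right)^{\gamma/(2r)} \left(\mathbb{E}Z^{2r(2-\gamma)/(2r-\gamma)}\right)^{(2r-\gamma)/(2r)}.$$

In particular, if we set $q = 2r(2 - \gamma)/(2r - \gamma)$ then we obtain

$$\left(\mathbb{E}Z^{2r}\right)^{1/r} \ge \left(\frac{\mathbb{E}Z^2}{\left(\mathbb{E}Z^q\right)^{(2-\gamma)/q}}\right)^{2/\gamma}. \tag{6.2}$$

Now, set $Z = |\nabla v_t(X) - \nabla v_t(Y)|$, $a = \mathbb{E}\nabla v_t$ and
$\epsilon(v_t) = \mathbb{E}(v_t(X) - \langle X, a\rangle - \mathbb{E}v_t)^2$. Lemma 4.2 and Proposition 6.5 then
imply that the right-hand side of (6.2) is at least

$$\left(\frac{2(1-\rho)\,\epsilon(v_t)}{\left(C k_t\sqrt{q(1-\rho)}\left((1 + k_t)\sqrt{\log(1/m(f))} + \sqrt{q}\,k_t\right)\right)^{2-\gamma}}\right)^{2/\gamma} = \left(\frac{c\,(1-\rho)^{\gamma/2}\,\epsilon(v_t)}{\left(\sqrt{q}\,k_t\left((1 + k_t)\sqrt{\log(1/m(f))} + \sqrt{q}\,k_t\right)\right)^{2-\gamma}}\right)^{2/\gamma}.$$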

Now define $\eta = 8k_t^2/(1 + 8k_t^2)$ and choose $r = 1 - \eta$ (so as to satisfy the
hypothesis of Proposition 4.3). If we then define $\gamma = 2r - \alpha\eta = 2 - (2 + \alpha)\eta$
for some $0 < \alpha < 1$, we will find that $q = 2r\frac{2+\alpha}{\alpha} \le 6/\alpha$. In particular, the last
displayed quantity is at least

$$(1-\rho)\,(c\alpha)^{2(2-\gamma)/\gamma}\, \frac{\epsilon(v_t)^{2/\gamma}}{\left((k_t^2 + 1)^2 \log(1/m(f))\right)^{(2-\gamma)/\gamma}}.$$
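The exponent bookkeeping in this step is easy to get wrong, so here is a small exact-arithmetic sketch of it. It assumes the definitions are read as $\eta = 8k_t^2/(1+8k_t^2)$, $r = 1-\eta$, $\gamma = 2r - \alpha\eta$ and $q = 2r(2-\gamma)/(2r-\gamma)$; the function name `exponents` is hypothetical.

```python
from fractions import Fraction


def exponents(k_sq, alpha):
    """Return (eta, r, gamma, q) for given k_t^2 and alpha, as exact fractions.

    Assumed definitions (our reading of the text):
      eta = 8 k^2 / (1 + 8 k^2), r = 1 - eta,
      gamma = 2 r - alpha * eta, q = 2 r (2 - gamma) / (2 r - gamma).
    """
    k_sq = Fraction(k_sq)
    alpha = Fraction(alpha)
    eta = 8 * k_sq / (1 + 8 * k_sq)
    r = 1 - eta
    gamma = 2 * r - alpha * eta
    q = 2 * r * (2 - gamma) / (2 * r - gamma)
    return eta, r, gamma, q


# Example: k_t^2 = 1/16 and alpha = 1/2.
eta, r, gamma, q = exponents(Fraction(1, 16), Fraction(1, 2))
alpha = Fraction(1, 2)
k_sq = Fraction(1, 16)
# The identities used in the text, checked exactly:
same_gamma = gamma == 2 - (2 + alpha) * eta
same_q = q == 2 * r * (2 + alpha) / alpha
q_bounded = q <= 6 / alpha
final_exponent = 2 / gamma == (1 + 8 * k_sq) / (1 - 4 * alpha * k_sq)
```

Under these assumptions, $2/\gamma$ simplifies to $(1+8k_t^2)/(1-4\alpha k_t^2)$, which is the exponent of $\epsilon(v_t)$ that appears in the subsequent bounds.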



Since $(k_t^2 + 1)^{(2-\gamma)/\gamma}$ depends only on $t$ and $\alpha$, we can put this all together
(going back to (6.2)) to obtain

$$\left(\mathbb{E}|\nabla v_t(X) - \nabla v_t(Y)|^{2r}\right)^{1/r} \ge c(t,\alpha)(1-\rho)\frac{\epsilon(v_t)^{2/\gamma}}{\log^{c(t)}(1/m(f))} = c(t,\alpha)(1-\rho)\frac{\epsilon(v_t)^{\frac{1+8k_t^2}{1-4\alpha k_t^2}}}{\log^{c(t)}(1/m(f))}.$$

Combined with Proposition 6.2, this implies

$$\frac{dR_t}{dt} \ge c(t,\alpha)\frac{\rho}{\sqrt{1-\rho^2}}\, m(f)^{(1+k_t)^2}\, \frac{1-\rho}{\log^{c(t)}(1/m(f))}\, \epsilon(v_t)^{\frac{1+8k_t^2}{1-4\alpha k_t^2}} \ge c(t,\alpha)\,\rho\sqrt{1-\rho}\; m(f)^{(1+k_t)^2+\alpha}\, \epsilon(v_t)^{\frac{1+8k_t^2}{1-4\alpha k_t^2}}, \tag{6.3}$$

where the last line follows because for every $\alpha > 0$ and every $C$, there is a
$C'(\alpha)$ such that for every $x \le \frac14$, $\log^C(1/x) \le C'(\alpha)x^{-\alpha}$. Now, with (6.3) as
an analogue of (4.12), we complete the proof by following that of
Proposition 4.1. Let us reiterate the main steps: recalling that
$\delta = \int_0^\infty \frac{dR_s}{ds}\,ds$, we see that for any $\alpha, t > 0$, there is some
$s \in [t, t(1 + \alpha)]$ so that $\frac{dR_s}{ds} \le \frac{\delta}{\alpha t}$.



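By (6.3) applied with $t = s$, we have

$$\epsilon(v_s) \le C(t,\alpha)\left(\frac{\delta}{\rho\sqrt{1-\rho}\; m(f)^{(1+k_s)^2+\alpha}}\right)^{\frac{1-4\alpha k_s^2}{1+8k_s^2}}.$$
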

Now, note that $\Phi$ is a contraction, and so Lemma 4.6 implies that

,2 



E(/ t (X)-P;_ 1 t $((X ) EVi; s >-E Us )) < 

<C(t,a)m 1+ak t " ~ a ( ° )~ 



(l+fc t ) 2 (l-4gfcf) 



By changing $\alpha$ and adjusting $C(t, \alpha)$ accordingly, we can put this inequality
into the form that was claimed in the proposition.

Finally, recall that $|\mathbb{E}\nabla v_s| \le k_s$ by Lemma 3.2 and so
$P_{s-t}^{-1}\Phi(\langle X, \mathbb{E}\nabla v_s\rangle - \mathbb{E}v_s)$ can be written in the form
$\Phi(\langle X, a\rangle - b)$ for some $a \in \mathbb{R}^n$, $b \in \mathbb{R}$ with $|a| \le k_t$. □



6.3 On the monotonicity of $\delta$ with respect to $\rho$

The final step in the proof of Theorem 1.5 is to improve the application of
Lemma 5.4. Assuming, for now, that $f$ is the indicator function of a set $A$,
the hypothesis of Theorem 1.5 tells us that if $e^{-t} = \rho$ then $\mathbb{E}1_A P_t 1_A$ is almost as
large as possible; that is, it is almost as large as $\mathbb{E}1_B P_t 1_B$ where $B$ is a
half-space of probability $\Pr(A)$. This assumption allows us to apply Lemma 5.4,
but only with $t = \log(1/\rho)$. In particular, this means that we will need to
use this value of $t$ in Proposition 6.1, which implies a poor dependence on $\rho$
in our final answer.

To avoid all these difficulties, we will follow Kane [20] and Kindler and
O'Donnell [25] to show that if $\mathbb{E}1_A P_t 1_A$ is almost as large as possible for
$t = \log(1/\rho)$, then it is also large for certain values of $t$ that are larger.

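Proposition 6.8. Suppose $A \subset \mathbb{R}^n$ has $\Pr(A) = 1/2$. If $\theta = \cos(k \cos^{-1}\rho)$
for some $k \in \mathbb{N}$, and

$$J(1/2, 1/2; \rho) - \mathbb{E}_\rho J(1_A(X), 1_A(Y); \rho) \le \delta$$

then

$$J(1/2, 1/2; \theta) - \mathbb{E}_\theta J(1_A(X), 1_A(Y); \theta) \le k\delta.$$

Proof. Let $Z_1$ and $Z_2$ be independent standard Gaussians on $\mathbb{R}^n$ and define
$Z(\gamma) = Z_1 \cos\gamma + Z_2 \sin\gamma$. Note that for any $\gamma$ and any $j \in \mathbb{N}$, $Z((j + 1)\gamma)$
and $Z(j\gamma)$ have correlation $\cos\gamma$. In particular, if $\gamma = \cos^{-1}(\rho)$, then the
union bound implies that

$$\begin{aligned}
\Pr\nolimits_\theta(X \in A, Y \notin A) &= \Pr(Z(0) \in A, Z(k\gamma) \notin A) \\
&\le \sum_{j=0}^{k-1} \Pr(Z(j\gamma) \in A, Z((j + 1)\gamma) \notin A) \\
&= k \Pr\nolimits_\rho(X \in A, Y \notin A).
\end{aligned} \tag{6.4}$$

The remarkable thing about this inequality is that it becomes equality when
$A$ is a half-space of measure 1/2, because in this case,
$\Pr_\rho(X \in A, Y \notin A) = \frac{1}{2\pi}\cos^{-1}(\rho)$.

Recall that $\mathbb{E}_\rho J(1_A(X), 1_A(Y); \rho) = \Pr_\rho(X \in A, Y \in A)$. Thus, the
hypothesis of the proposition can be rewritten as

$$\frac{1}{2} - \frac{1}{2\pi}\cos^{-1}\rho - \Pr\nolimits_\rho(X \in A, Y \in A) \le \delta,$$

which, since $\Pr_\rho(X \in A, Y \notin A) = \frac12 - \Pr_\rho(X \in A, Y \in A)$, rearranges to read

$$\Pr\nolimits_\rho(X \in A, Y \notin A) \le \delta + \frac{1}{2\pi}\cos^{-1}\rho.$$

By (6.4), this implies that

$$\Pr\nolimits_\theta(X \in A, Y \notin A) \le k\delta + \frac{1}{2\pi}\cos^{-1}\theta,$$

which can then be rearranged to yield the conclusion of the proposition. □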
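The equality case of (6.4) can be illustrated numerically. The sketch below assumes the standard formula $\Pr_\rho(X \in A, Y \notin A) = \frac{1}{2\pi}\cos^{-1}\rho$ for a half-space of measure 1/2 quoted in the proof, and compares it with a one-dimensional Monte Carlo estimate for $A = \{x < 0\}$. All function names are hypothetical.

```python
import math
import random


def halfspace_disagreement(rho):
    """Pr_rho(X in A, Y not in A) for a half-space A of Gaussian measure 1/2."""
    return math.acos(rho) / (2.0 * math.pi)


def theta_from(rho, k):
    """The correlation theta = cos(k * arccos(rho)) from Proposition 6.8."""
    return math.cos(k * math.acos(rho))


def monte_carlo_disagreement(rho, samples=200_000, seed=1):
    """Estimate Pr(X < 0 <= Y) for rho-correlated standard Gaussians X, Y."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + s * rng.gauss(0.0, 1.0)
        if x < 0.0 <= y:
            hits += 1
    return hits / samples


# With rho = cos(pi/8) and k = 3 the angle k * arccos(rho) = 3*pi/8 stays
# below pi, so the union bound (6.4) is tight for the half-space.
rho = math.cos(math.pi / 8)
theta = theta_from(rho, 3)
lhs = halfspace_disagreement(theta)
rhs = 3 * halfspace_disagreement(rho)
```

For sets other than half-spaces, only the inequality in (6.4) is available, which is what the proposition exploits.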

Let us point out two deficiencies in Proposition 6.8: the requirement
that $\Pr(A) = 1/2$ and that $k$ be an integer. The first of these deficiencies is
responsible for the assumption $\mathbb{E}f = \frac12$ in Theorem 1.5 and the second one
prevents us from obtaining a better constant in the exponent of $\delta$. Both of
these restrictions come from the subadditivity condition (6.4), which only
makes sense for an integer $k$, and only achieves equality for a half-space
of volume $\frac12$. But beyond the fact that our proof fails, we have no reason
not to believe that some version of Proposition 6.8 is true without these
restrictions. In particular, we make the following conjecture:

Conjecture 6.9. There is a function $k(\rho, a)$ such that

• for any fixed $a \in (0, 1)$, $k(\rho, a) \sim \sqrt{1 - \rho}$ as $\rho \to 1$;

• for any fixed $a \in (0, 1)$, $k(\rho, a) \sim \rho$ as $\rho \to 0$; and

• for any $a \in (0, 1)$ and any $A \subset \mathbb{R}^n$ the quantity

$$\frac{J(a, a; \rho) - \mathbb{E}_\rho J(1_A(X), 1_A(Y); \rho)}{k(\rho, a)}$$

is increasing in $\rho$.

If this conjecture were true, it would tell us that sets which are almost 
optimal for some p are also almost optimal for smaller p, where the function 
k(p,a) quantifies the almost optimality. 

In any case, let us move on to the proof of Theorem 1.5. If the conjecture
is true, then the following proof will directly benefit from the improvement.

Proof of Theorem 1.5. We will prove the theorem when $f$ is the indicator
function of a set $A$. The extension to general $f$ follows from the same
argument that was made in the proof of Theorem 1.4.

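Fix $\epsilon > 0$. If $\rho_0$ is close enough to 1 then for every $\rho_0 < \rho < 1$, there is
a $k \in \mathbb{N}$ such that $k\cos^{-1}(\rho) \in [\frac{\pi}{2} - \epsilon, \frac{\pi}{2} - \frac{\epsilon}{2}]$. In particular, this means that
$\cos(k\cos^{-1}(\rho)) \in [c_1(\epsilon), c_2(\epsilon)]$, where $c_1(\epsilon)$ and $c_2(\epsilon)$ converge to zero as
$\epsilon \to 0$. Moreover, this $k$ must satisfy

$$k \le \frac{C}{\cos^{-1}(\rho)} \le \frac{C}{\sqrt{1-\rho}}.$$

Now let $\theta = \cos(k\cos^{-1}(\rho))$. By Proposition 6.8, $A$ satisfies

$$J(1/2, 1/2; \theta) - \mathbb{E}_\theta J(1_A(X), 1_A(Y); \theta) \le C(\epsilon)\frac{\delta}{\sqrt{1-\rho}}.$$

Now we will apply Proposition 6.1 with $\rho$ replaced by $\theta$ and $t = \log(1/\theta)$.
Since $\theta \le c_2(\epsilon)$, it follows that $k_t = \theta/\sqrt{1 - \theta^2} \le c_3(\epsilon)$ (where $c_3(\epsilon) \to 0$ with
$\epsilon$). Thus, the conclusion of Proposition 6.1 gives us $a \in \mathbb{R}^n$, $b \in \mathbb{R}$ such that

$$\mathbb{E}\left(P_t 1_A(X) - \Phi(\langle X, a\rangle - b)\right)^2 \le C(\epsilon)\left(\frac{\delta}{\sqrt{1-\rho}}\right)^{1-c_4(\epsilon)}. \tag{6.5}$$

Now we apply the same time-reversal argument as in Theorem 1.4.
Lemma 3.1 implies that there is some $s > 0$ and a half-space $B$ such that

$$\mathbb{E}(P_t 1_A - P_{t+s} 1_B)^2 \le C(\epsilon)\left(\frac{\delta}{\sqrt{1-\rho}}\right)^{1-c_4(\epsilon)}$$

and we can assume, at the cost of increasing $C(\epsilon)$, that $\Pr(B) = \Pr(A)$.
Then Proposition 5.3 implies that

$$\mathbb{E}(P_t 1_A - P_t 1_B)^2 \le C(\epsilon)\left(\frac{\delta}{\sqrt{1-\rho}}\right)^{1-c_4(\epsilon)}$$

and we apply Proposition 5.1 (recalling that $t$ is bounded above and below
by constants depending on $\epsilon$) to conclude that

$$\Pr(A \Delta B) \le C(\epsilon)\left(\frac{\delta}{\sqrt{1-\rho}}\right)^{1/4 - c_4(\epsilon)/4}.$$

Recall that $c_4(\epsilon)$ is some quantity tending to zero with $\epsilon$. Therefore, we
can derive the claim of the theorem from the inequality above by modifying
$C(\epsilon)$. □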

Finally, we will prove Corollary 1.9.

Proof of Corollary 1.9. Since $xy \le J(x, y)$, the hypothesis of Corollary 1.9
implies that

$$\mathbb{E}J(f(X), f(Y)) \ge \frac14 + \frac{1}{2\pi}\arcsin(\rho) - \delta.$$

Now, consider Theorem 1.5 with $\epsilon = 1/8$. If $\rho > \rho_0$ then apply it; if not, apply
Theorem 1.3. In either case, the conclusion is that there is some $a \in \mathbb{R}^n$ such
that

$$\mathbb{E}|f(X) - \Phi(\langle X, a\rangle)| \le C(\rho)\delta^c.$$

Setting $g(X) = \Phi(\langle X, a\rangle)$, Hölder's inequality implies that

$$|\mathbb{E}g(X)g(Y) - \mathbb{E}f(X)f(Y)| = |\mathbb{E}(g(X) - f(X))g(Y) + \mathbb{E}f(X)(g(Y) - f(Y))| \le 2\mathbb{E}|f - g|.$$

In particular,

$$\mathbb{E}g(X)g(Y) \ge \frac14 + \frac{1}{2\pi}\arcsin(\rho) - \delta - C(\rho)\delta^c. \tag{6.6}$$

But the left hand side can be computed exactly: if $|a| = (e^{2t} - 1)^{-1/2}$ and
$A = \{x \in \mathbb{R}^n : x_1 \le 0\}$ then

$$\begin{aligned}
\mathbb{E}g(X)g(Y) &= \mathbb{E}P_t 1_A(X) P_t 1_A(Y) \\
&= \mathbb{E}1_A(X) P_{2t - \log(\rho)} 1_A(X) \\
&= \frac14 + \frac{1}{2\pi}\arcsin(e^{-2t}\rho) \\
&\le \frac14 + \frac{1}{2\pi}\arcsin(\rho) - \frac{1}{2\pi}\rho(1 - e^{-2t}),
\end{aligned}$$

where the last line used the fact that the derivative of arcsin is at least 1.
Combining this with (6.6), we have

$$1 - e^{-2t} \le C(\rho)\delta^c. \tag{6.7}$$

On the other hand,

$$\mathbb{E}|g - 1_A| = 2(1/2 - \mathbb{E}g1_A) = \frac12 - \frac1\pi\arcsin(e^{-t}) \le \sqrt{1 - e^{-2t}},$$

which combines with (6.7) to prove that $\mathbb{E}|g - 1_A| \le C(\rho)\delta^c$. Applying the
triangle inequality, we conclude that

$$\mathbb{E}|f - 1_A| \le \mathbb{E}|f - g| + \mathbb{E}|g - 1_A| \le C(\rho)\delta^c.$$ □

7 The robust "majority is stablest" theorem 

In this section, we prove Theorem 1.10. For the rest of this section, we set $\xi$
and $\sigma$ to be uniformly random elements in $\{-1, 1\}^n$ satisfying $\mathbb{E}\xi_i\sigma_i = \rho$ for
all $i$.

We begin the proof of Theorem 1.10 by recalling some Fourier-theoretic
properties of $\{-1, 1\}^n$. For $S \subset [n]$, define $\chi_S : \{-1, 1\}^n \to \{-1, 1\}$ by
$\chi_S(x) = \prod_{i \in S} x_i$. Then $\{\chi_S : S \subset [n]\}$ form an orthonormal basis of $L_2(\{-1, 1\}^n)$.
We will write $\hat f_S$ for the coefficients of $f$ in this basis; that is,

$$f(x) = \sum_{S \subset [n]} \hat f_S \chi_S(x). \tag{7.1}$$

Recall the Bonami-Beckner semigroup $Q_t$ defined by

$$(Q_t f)(\xi) = \mathbb{E}_{e^{-t}}(f(\sigma) \mid \xi),$$

and denote $Q_t f$ by $f_t$; then

$$S_\rho(f) = \mathbb{E}_\rho f(\xi)f(\sigma) = \mathbb{E}f f_{\log(1/\rho)}.$$
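As a concrete illustration of these definitions, the following brute-force sketch computes the Fourier coefficients of a function on $\{-1,1\}^n$ and evaluates the noise stability in two ways: directly from the definition $\mathbb{E}_\rho f(\xi)f(\sigma)$, and through the standard identity $S_\rho(f) = \sum_S \rho^{|S|}\hat f_S^2$ (a well-known consequence of $Q_t\chi_S = e^{-t|S|}\chi_S$, stated here as an assumption since the text does not spell it out). The function names and the choice of the 3-bit majority as an example are ours.

```python
import itertools
import math


def fourier_coefficients(f, n):
    """Return {S: hat f_S} for f on {-1,1}^n, with S a tuple of coordinates."""
    points = list(itertools.product((-1, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in itertools.combinations(range(n), size):
            total = sum(f(x) * math.prod(x[i] for i in S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs


def stability_fourier(f, n, rho):
    """Noise stability via sum over S of rho^|S| * (hat f_S)^2."""
    return sum(rho ** len(S) * c * c
               for S, c in fourier_coefficients(f, n).items())


def stability_direct(f, n, rho):
    """E f(xi) f(sigma): xi uniform, sigma_i = xi_i with probability (1+rho)/2."""
    points = list(itertools.product((-1, 1), repeat=n))
    total = 0.0
    for xi in points:
        for sigma in points:
            prob = 1.0
            for a, b in zip(xi, sigma):
                prob *= (1 + rho) / 2 if a == b else (1 - rho) / 2
            total += f(xi) * f(sigma) * prob
    return total / len(points)


def majority3(x):
    """Majority of three +-1 bits, as a +-1 valued function."""
    return 1 if sum(x) > 0 else -1
```

For the 3-bit majority the nonzero coefficients are $1/2$ on each singleton and $-1/2$ on $\{1,2,3\}$, so both routines should agree with $\frac34\rho + \frac14\rho^3$.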

7.1 The invariance principle 

Note that any function $f : \{-1, 1\}^n \to \mathbb{R}$ can be extended to a multilinear
function on $\mathbb{R}^n$ through the Fourier expansion (7.1): since $\chi_S(x)$ is
defined for all $x \in \mathbb{R}^n$, we may define $g(x)$ for $x \in \mathbb{R}^n$ by $g(x) = \sum_S \hat f_S \chi_S(x)$.
We will say that $g$ is the multilinear extension of $f$; note that $g$ and $f$ agree
on $\{-1, 1\}^n$, thereby justifying the term "extension."
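To make the multilinear extension concrete, here is a minimal sketch that builds it from brute-force Fourier coefficients. The helper name `multilinear_extension` and the example function `and2` (the indicator that both bits equal 1) are hypothetical choices for illustration.

```python
import itertools
import math


def multilinear_extension(f, n):
    """Return g: R^n -> R, the multilinear extension of f on {-1,1}^n."""
    points = list(itertools.product((-1, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in itertools.combinations(range(n), size):
            total = sum(f(x) * math.prod(x[i] for i in S) for x in points)
            coeffs[S] = total / len(points)

    def g(x):
        # Evaluate the Fourier expansion at an arbitrary real point x.
        return sum(c * math.prod(x[i] for i in S) for S, c in coeffs.items())

    return g


def and2(x):
    """Indicator of the event x_1 = x_2 = 1, a [0,1]-valued function."""
    return 1.0 if x[0] == 1 and x[1] == 1 else 0.0


# The extension of and2 is (1 + x_1 + x_2 + x_1 x_2) / 4.
g = multilinear_extension(and2, 2)
```

Here $g$ agrees with `and2` on the cube, but $g(3, 3) = 4$, so the extension of a $[0,1]$-valued function can leave $[0,1]$ off the cube. This is the reason a truncation step is needed when passing to Gaussian inputs.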
Let us remark on some well-known and important properties of multilinear
polynomials. First of all, since $\mathbb{E}\xi_i = \mathbb{E}X_i = 0$ and $\mathbb{E}\xi_i^2 = \mathbb{E}X_i^2 = 1$, it is
trivial to check that for multilinear functions $f$ and $g$,

$$\begin{aligned}
\mathbb{E}f(\xi) &= \mathbb{E}f(X) \\
\mathbb{E}f^2(\xi) &= \mathbb{E}f^2(X) \\
\mathbb{E}_\rho f(\xi)g(\sigma) &= \mathbb{E}_\rho f(X)g(Y).
\end{aligned}$$

It is also easy to check that if $f$ is a multilinear polynomial then for any $t > 0$,
$Q_t f$ and $P_t f$ are the same polynomial. In particular, there is no ambiguity
in using the notation $f_t$ for both $P_t f$ and $Q_t f$.

Despite these similarities, $g(X)$ and $g(\xi)$ can have very different
distributions in general (for example, if $g(x) = x_1$). The main technical result
of [32] is that when $f$ has low influence and $t > 0$, then $f_t(X)$ and $f_t(\xi)$ have
similar distributions. We will quote a much less general statement than the
one proved in [32], which will nevertheless be sufficient for our purposes. In
particular, we will only need to know that if $g(\xi)$ takes values in $[0, 1]$, then
$g(X)$ mostly takes values in $[0, 1]$. Before stating the theorem from [32], let
us introduce some notation: for a function $f$ taking values in $\mathbb{R}$, let $\bar f$ be its
truncation which takes values in $[0, 1]$:

$$\bar f(x) = \begin{cases} 0 & \text{if } f(x) < 0 \\ f(x) & \text{if } 0 \le f(x) \le 1 \\ 1 & \text{if } 1 < f(x). \end{cases}$$

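Theorem 7.1. Suppose $f$ is a multilinear polynomial such that $f(\xi) \in [0, 1]$
for all $\xi \in \{-1, 1\}^n$. If $f$ satisfies $\max_i \mathrm{Inf}_i(f) \le \tau$ then for any $\eta > 0$,

$$\mathbb{E}(f_\eta(X) - \bar f_\eta(X))^2 \le C\tau^{c\eta}. \tag{7.2}$$

We will now use Theorem 7.1 to prove Theorem 1.10. First, (7.2) and
the triangle inequality imply that for any $0 < \rho' < 1$,

$$\mathbb{E}_{\rho'} f_\eta(X)f_\eta(Y) \le \mathbb{E}_{\rho'} \bar f_\eta(X)\bar f_\eta(Y) + C\tau^{c\eta}. \tag{7.3}$$

Now,

$$\mathbb{E}_{\rho'} f_\eta(X)f_\eta(Y) = \mathbb{E}_{\rho'} f_\eta(\xi)f_\eta(\sigma) = \mathbb{E}_{e^{-2\eta}\rho'} f(\xi)f(\sigma). \tag{7.4}$$

If we set $\rho' = e^{2\eta}\rho$ (assuming that $\eta$ is small enough so that $e^{2\eta}\rho < 1$)
then (7.3), (7.4), and the assumption (1.6) of Theorem 1.10 imply that

$$\begin{aligned}
\mathbb{E}_{\rho'} \bar f_\eta(X)\bar f_\eta(Y) &\ge J(\mathbb{E}f, \mathbb{E}f; \rho) - C\tau^{c\eta} - \epsilon \\
&\ge J(\mathbb{E}\bar f_\eta, \mathbb{E}\bar f_\eta; \rho) - C\tau^{c\eta} - \epsilon \\
&\ge J(\mathbb{E}\bar f_\eta, \mathbb{E}\bar f_\eta; \rho') - C(\rho)\eta - C\tau^{c\eta} - \epsilon,
\end{aligned}$$

where the second inequality follows because (by (7.2)) $|\mathbb{E}f - \mathbb{E}\bar f_\eta| \le C\tau^{c\eta}$ and
$\frac{\partial J(x,y;\rho)}{\partial x}$ is bounded. Applying Corollary 1.9 (with $\rho'$ in place of $\rho$) to $\bar f_\eta$,
we see that there are $a, b \in \mathbb{R}^n$ such that

$$\mathbb{E}(\bar f_\eta(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c.$$

By (7.2) and the triangle inequality, we may replace $\bar f_\eta$ by $f_\eta$:

$$\mathbb{E}(f_\eta(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c. \tag{7.5}$$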

The next step is to pull (7.5) back to the discrete cube by replacing $X$
by $\xi$. We will do this using Theorem 7.1. As a prerequisite, we need to show
that $1_{\{\langle a, x-b\rangle \ge 0\}}$ has small influences; this is essentially the same as saying
that $a$ is well-spread:

Lemma 7.2. There is an $a \in \mathbb{R}^n$ satisfying (7.5) with $\sum_i a_i^2 = 1$ and
$\max_i |a_i| \le C\tau^c$.

Once we have shown that $1_{\{\langle a, x-b\rangle \ge 0\}}$ has small influences, we can use
Theorem 7.1 to show that the multilinear extension of $1_{\{\langle a, x-b\rangle \ge 0\}}$ is close to
$1_{\{\langle a, x-b\rangle \ge 0\}}$:

Lemma 7.3. Let $g^{a,b}$ be the multilinear extension of the function
$x \mapsto 1_{\{\langle a, x-b\rangle \ge 0\}}$. If $\sum_i a_i^2 = 1$ and $\max_i |a_i| \le \tau$ then for any $\eta > 0$,

$$\mathbb{E}(g_\eta^{a,b}(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \le C(\eta + \tau^{c\eta}).$$

From Lemma 7.3 and the triangle inequality, we conclude from (7.5) that

$$\mathbb{E}(f_\eta(X) - g_\eta^{a,b}(X))^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c.$$

Since $f_\eta - g_\eta^{a,b}$ is a multilinear polynomial, its second moment remains
unchanged when $X$ is replaced by $\xi$:

$$\mathbb{E}(f_\eta(\xi) - g_\eta^{a,b}(\xi))^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c.$$

Now, $g^{a,b}$ is the indicator of a half-space on the cube; thus,
$\mathbb{E}(g_\eta^{a,b}(\xi) - g^{a,b}(\xi))^2 \le C\eta^c$ (see, for example, [4]). Applying this and the
triangle inequality, we have

$$\mathbb{E}(f_\eta(\xi) - g^{a,b}(\xi))^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c. \tag{7.6}$$

The last piece is to replace $f_\eta$ by $f$. We do this with a simple lemma which
shows that for any function $f$, if $f_\eta$ is close to some indicator function then
$f$ is also close to the same indicator function.

Lemma 7.4. For any functions $f : \{-1, 1\}^n \to [0, 1]$ and $g : \{-1, 1\}^n \to \{0, 1\}$
and any $\eta > 0$,

$$\mathbb{E}(f(\xi) - g(\xi))^2 \le C\,\mathbb{E}(f_\eta(\xi) - g(\xi))^2.$$

Applying Lemma 7.4 to (7.6), we obtain

$$\mathbb{E}(f(\xi) - g^{a,b}(\xi))^2 \le C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c.$$

By choosing $\tau$ and $\eta$ small enough compared to $\epsilon$, the proof of Theorem 1.10
is complete, modulo the proofs of Lemmas 7.2, 7.3 and 7.4. We will prove
them in the coming section.

7.2 Gaussian and boolean halfspaces 

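Here we will prove the lemmas of the previous section. Before doing so,
let us observe that $\mathbb{E}X_i 1_{\{\langle a, X-b\rangle \ge 0\}}$ is proportional to $a_i$, a fact which has
already been noted by Matulef et al. [30]:

Lemma 7.5.

$$\mathbb{E}X_i 1_{\{\langle a, X-b\rangle \ge 0\}} = a_i\,\phi(\langle a, b\rangle).$$

Proof. Let $e_i \in \mathbb{R}^n$ be the vector with 1 in position $i$ and 0 elsewhere. We
may write $e_i = a_i a + a^\perp$, where $a^\perp$ is some element of $\mathbb{R}^n$ which is orthogonal to
$a$. Note that $\langle X, a^\perp\rangle$ is independent of $\langle X, a\rangle$ and so
$\mathbb{E}\langle X, a^\perp\rangle 1_{\{\langle a, X-b\rangle \ge 0\}} = 0$. Hence,

$$\mathbb{E}X_i 1_{\{\langle a, X-b\rangle \ge 0\}} = a_i\,\mathbb{E}\langle a, X\rangle 1_{\{\langle a, X-b\rangle \ge 0\}} = a_i\,\mathbb{E}X_1 1_{\{X_1 \ge \langle a, b\rangle\}},$$

where the second equality follows because, by the rotational invariance
of the Gaussian measure, $\langle a, X\rangle$ has the same distribution as $X_1$. Finally,
integration by parts implies that $\mathbb{E}X_1 1_{\{X_1 \ge \langle a, b\rangle\}} = \phi(\langle a, b\rangle)$. □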
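Lemma 7.5 lends itself to a quick Monte Carlo sanity check. The sketch below assumes that $a$ is a unit vector (as the rotational invariance step requires) and that $\phi$ denotes the standard Gaussian density; the function names `phi` and `estimate_first_moment` are hypothetical.

```python
import math
import random


def phi(t):
    """Standard Gaussian density."""
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)


def estimate_first_moment(a, b, i, samples=200_000, seed=2):
    """Monte Carlo estimate of E[X_i * 1{<a, X - b> >= 0}] for Gaussian X."""
    rng = random.Random(seed)
    n = len(a)
    total = 0.0
    for _ in range(samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if sum(aj * (xj - bj) for aj, xj, bj in zip(a, x, b)) >= 0.0:
            total += x[i]
    return total / samples


# Example with a unit vector a = (0.6, 0.8) and b = (0.5, 0), so <a, b> = 0.3.
a = (0.6, 0.8)
b = (0.5, 0.0)
predicted = a[0] * phi(0.3)
estimated = estimate_first_moment(a, b, 0)
```

The estimate should match $a_1\phi(\langle a, b\rangle)$ up to Monte Carlo error of order $10^{-3}$ at this sample size.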

Next, we prove Lemma 7.2. The point is that if a halfspace is close to a
low-influence function $f$ then that halfspace must also have low influences.
We can then perturb the halfspace to have even lower influences without
increasing its distance to $f$ by much.

Proof of Lemma 7.2. Suppose that $f$ has influences bounded by $\tau$, and that

$$\mathbb{E}(f(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \le \gamma, \tag{7.7}$$

where $\gamma = C(\rho)(\eta + \tau^{c\eta} + \epsilon)^c$. We will show that there is some $\tilde a$ such that
$\sum_i \tilde a_i^2 = 1$, $\max_i |\tilde a_i| \le C\tau^c$, and

$$\mathbb{E}(f(X) - 1_{\{\langle \tilde a, X-b\rangle \ge 0\}})^2 \le \gamma^c. \tag{7.8}$$

When applied to the function $f_\eta$, this will imply the claim of Lemma 7.2.

Since the influences of $f$ are bounded by $\tau$, it follows in particular that
$|\hat f_{\{i\}}| \le \tau$ for every $i$. On the other hand, the $X_i$ form an orthonormal sequence
and so

$$\mathbb{E}(f(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \ge \sum_i \left(\mathbb{E}X_i f(X) - \mathbb{E}X_i 1_{\{\langle a, X-b\rangle \ge 0\}}\right)^2 = \sum_i \left(\hat f_{\{i\}} - a_i\phi(\langle a, b\rangle)\right)^2, \tag{7.9}$$

where the equality used Lemma 7.5. Defining $\kappa_{a,b} = \phi(\langle a, b\rangle)$, it follows that
for any $i$ with $|a_i|\kappa_{a,b} \ge C\tau$, we have $(\hat f_{\{i\}} - a_i\kappa_{a,b})^2 \ge c\,a_i^2\kappa_{a,b}^2$. Combining
this with (7.7) and (7.9),

$$\gamma \ge \mathbb{E}(f(X) - 1_{\{\langle a, X-b\rangle \ge 0\}})^2 \ge c\,\kappa_{a,b}^2 \sum_{i : |a_i|\kappa_{a,b} \ge C\tau} a_i^2. \tag{7.10}$$

We consider two cases, depending on whether $\kappa_{a,b}$ is large or small.
First, suppose that $\kappa_{a,b} \le \gamma^{1/3}$. Now,
$\kappa_{a,b} \ge c\Pr(X_1 \ge \langle a, b\rangle) = c\,\mathbb{E}1_{\{\langle a, X-b\rangle \ge 0\}}$,
while (7.7) implies that

$$\mathbb{E}f \le \sqrt{\gamma} + \mathbb{E}1_{\{\langle a, X-b\rangle \ge 0\}} \le \sqrt{\gamma} + C\gamma^{1/3} \le C\gamma^{1/3}.$$

Since $f$ takes values in $[0, 1]$, it follows that $f$ is close to the zero function;
in particular, any half-space with small enough measure will satisfy (7.8).

Now suppose that $\kappa_{a,b} \ge \gamma^{1/3}$ (which is in turn larger than $\tau^{1/3}$); then (7.10)
implies that

$$\sum_{i : |a_i| \ge C\tau^{2/3}} a_i^2 \le C\gamma^{1/3}.$$

If we define $\tilde a$ to be the truncated version of $a$ (i.e. $\tilde a_i = a_i$ if $|a_i| \le C\tau^{2/3}$
and $\tilde a_i = 0$ otherwise), then this implies that $|a - \tilde a|^2 \le C\gamma^{1/3}$. Moreover,
$|\tilde a|^2 \ge 1 - C\gamma^{1/3}$, which implies that we can normalize $\tilde a$ so that $|\tilde a| = 1$,
while preserving the fact that $|a - \tilde a|^2 \le C\gamma^{1/3}$ and $\max_i |\tilde a_i| \le C\tau^{2/3}$. Finally,
$|a - \tilde a|^2 \le C\gamma^{1/3}$ implies that

$$\mathbb{E}\left(1_{\{\langle a, X-b\rangle \ge 0\}} - 1_{\{\langle \tilde a, X-b\rangle \ge 0\}}\right)^2 \le C\gamma^c.$$

By the triangle inequality and (7.7), (7.8) follows. □

Next, we will prove Lemma 7.3: if $g^{a,b}$ is the linear extension of a
low-influence halfspace, then $g^{a,b}$ is close to a halfspace. Observe that this is
very much not the case for general halfspaces: the linear extension of $1_{\{x_1 \ge 0\}}$
is $x_1$, which is not close, in $L_2(\mathbb{R}^n, \gamma_n)$, to any halfspace.

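Proof of Lemma 7.3. The proof rests on the invariance principle
(Theorem 7.1). Let $g$ be the linear extension of $1_{\{\langle a, x-b\rangle \ge 0\}}$ and let
$h(x) = \langle a, x-b\rangle$. First of all, the Berry-Esseen theorem implies that for any
$M > 0$,

$$\begin{aligned}
\mathbb{E}g(\xi)h(\xi) = \mathbb{E}h(\xi)1_{\{h(\xi) \ge 0\}} &= \int_{\langle a,b\rangle}^\infty \Pr(\langle a, \xi\rangle \ge t)\,dt \\
&\ge \int_{\langle a,b\rangle}^M \Pr(\langle a, \xi\rangle \ge t)\,dt \\
&\ge \int_{\langle a,b\rangle}^M \Pr(X_1 \ge t)\,dt - CM\tau \\
&\ge \int_{\langle a,b\rangle}^\infty \Pr(X_1 \ge t)\,dt - CM\tau - Ce^{-M^2/2}.
\end{aligned}$$

Choosing $M = \sqrt{\log(1/\tau)}$, we have

$$\mathbb{E}g(\xi)h(\xi) \ge \int_{\langle a,b\rangle}^\infty \Pr(X_1 \ge t)\,dt - C\tau^c = \mathbb{E}1_{\{\langle a, X-b\rangle \ge 0\}}h(X) - C\tau^c. \tag{7.11}$$

Now, $h$ is linear and so $h_t = e^{-t}h$; since $P_t$ is self-adjoint, we have

$$\begin{aligned}
\mathbb{E}g(\xi)h(\xi) = \mathbb{E}g(X)h(X) &= e^{\eta}\,\mathbb{E}g_\eta(X)h(X) \\
&\le e^{\eta}\,\mathbb{E}\bar g_\eta(X)h(X) + C(\eta + \tau^{c\eta}),
\end{aligned}$$

where the last inequality assumes that $\eta < 1$ (if not then the lemma is trivial
anyway). Combining this with (7.11),

$$\mathbb{E}1_{\{\langle a, X-b\rangle \ge 0\}}h(X) \le \mathbb{E}\bar g_\eta(X)h(X) + C(\eta + \tau^{c\eta}). \tag{7.12}$$

Now, let $m(X) = 1_{\{\langle a, X-b\rangle \ge 0\}} - \bar g_\eta(X)$ and take $\epsilon = \mathbb{E}|m|$; note that when $m \ne 0$
then $m$ and $h$ have the same sign. Let $A = \{x : \langle a, x-b\rangle \in [-\epsilon/2, \epsilon/2]\}$. Then
$\Pr(A) \le \epsilon/2$, and since $|m| \le 1$ we must have $\mathbb{E}|m|1_{A^c} \ge \mathbb{E}|m| - \Pr(A) \ge \epsilon/2$.
But on $A^c$ we have $|h(x)| \ge \epsilon/2$; since the signs of $m$ and $h$ agree,

$$\mathbb{E}m(X)h(X) \ge \mathbb{E}m(X)h(X)1_{\{X \in A^c\}} \ge \frac{\epsilon}{2}\mathbb{E}|m|1_{A^c} \ge \frac{\epsilon^2}{4}.$$

Applying this to (7.12) yields $\epsilon \le C(\eta + \tau^{c\eta})^c$. So if we recall the definition
of $\epsilon$, then we see that

$$\mathbb{E}\left|1_{\{\langle a, X-b\rangle \ge 0\}} - \bar g_\eta(X)\right| \le C(\eta + \tau^{c\eta})^c.$$

By changing the constant $c$, we may replace $\mathbb{E}|\cdot|$ with $\mathbb{E}(\cdot)^2$; by (7.2), we
may replace $\bar g_\eta$ by $g_\eta$. This completes the proof of the lemma. Note that
the only reason for proving this lemma with $g_\eta$ instead of $g$ was for extra
convenience when applying it; the statement of the lemma is also true with
$g$ instead of $g_\eta$. □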

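The only remaining piece is Lemma 7.4.

Proof of Lemma 7.4. Suppose $f : \{-1, 1\}^n \to [-1, 1]$ and $g : \{-1, 1\}^n \to \{-1, 1\}$.
This does not exactly correspond to the statement of the lemma,
but it will be more convenient for the proof; we can recover the statement
of the lemma by replacing $f$ by $\frac{1+f}{2}$ and $g$ by $\frac{1+g}{2}$.

Let $\epsilon = \mathbb{E}(f_\eta(\xi) - g(\xi))^2$. Since $g$ takes values in $\{-1, 1\}$, we have $\mathbb{E}g^2 = 1$;
then the triangle inequality implies that

$$\mathbb{E}f_\eta^2 \ge \mathbb{E}g^2 - 2\epsilon = 1 - 2\epsilon.$$

Since $\mathbb{E}f^2 \le 1$, we have

$$\begin{aligned}
\mathbb{E}(f - f_\eta)^2 &= \sum_{S \subset [n]} \hat f_S^2 (1 - e^{-\eta|S|})^2 \\
&\le \sum_{S \subset [n]} \hat f_S^2 (1 - e^{-2\eta|S|}) \\
&= \mathbb{E}f^2 - \mathbb{E}f_\eta^2 \\
&\le 2\epsilon.
\end{aligned}$$

It then follows by the triangle inequality that $\mathbb{E}(f - g)^2 \le C\epsilon$. □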

8 Spherical noise stability 

We now use Theorem 1.4 to prove Theorem 1.11. For a subset $A \subset S^{n-1}$, we
define $\bar A \subset \mathbb{R}^n$ to be the radial extension of $A$:

$$\bar A = \left\{x \in \mathbb{R}^n : x \ne 0 \text{ and } \frac{x}{|x|} \in A\right\}.$$

From the spherical symmetry of the Gaussian distribution it immediately
follows that $\Pr(\bar A) = Q(A)$. The proof of Theorem 1.11 crucially relies on
the fact that $Q_\rho(A_1, A_2)$ is close to $\Pr_\rho(\bar A_1, \bar A_2)$ in high dimensions. More
explicitly, it uses the following lemmas:

Lemma 8.1. For any half-space $H = \{x \in \mathbb{R}^n : \langle a, x\rangle \le b\}$ there is a spherical
cap $B = \{x \in S^{n-1} : \langle a, x\rangle \le b'\}$ such that $\Pr(\bar B) = \Pr(H)$ and

$$\Pr(\bar B \Delta H) \le Cn^{-1/2}\log n.$$

Lemma 8.2. For any two sets $A_1, A_2 \subset S^{n-1}$ and any $\rho \in [-1 + \epsilon, 1 - \epsilon]$ it
holds that

$$|Q_\rho(A_1, A_2) - \Pr\nolimits_\rho(\bar A_1, \bar A_2)| \le C(\epsilon)n^{-1/2}\log n.$$

Given Lemmas 8.2 and 8.1, the proof of Theorem 1.11 is an easy corollary
of Theorem 1.4.

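Proof of Theorem 1.11. Define $\delta_* = \delta(\bar A_1, \bar A_2)$. Let $H_1, H_2$ be parallel
half-spaces with $\Pr(H_i) = \Pr(\bar A_i)$, and let $B_1, B_2$ be the corresponding caps
whose existence is guaranteed by Lemma 8.1. Then

$$\begin{aligned}
\delta_* &= \delta(\bar A_1, \bar A_2) \\
&= \Pr\nolimits_\rho(X \in H_1, Y \in H_2) - \Pr\nolimits_\rho(X \in \bar A_1, Y \in \bar A_2) \\
&\le \Pr\nolimits_\rho(X \in \bar B_1, Y \in \bar B_2) - \Pr\nolimits_\rho(X \in \bar A_1, Y \in \bar A_2) + O(n^{-1/2}\log n) \\
&\le Q_\rho(X \in B_1, Y \in B_2) - Q_\rho(X \in A_1, Y \in A_2) + O(n^{-1/2}\log n) \\
&= \delta(A_1, A_2) + O(n^{-1/2}\log n),
\end{aligned}$$

where the first inequality follows from Lemma 8.1 and the second follows
from Lemma 8.2.

From Theorem 1.4 it follows that there are parallel half-spaces $H_1$ and
$H_2$ with $\Pr(H_i) = \Pr(\bar A_i)$ satisfying

$$\Pr(\bar A_i \Delta H_i) \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta_*^{\frac14 \frac{1-\rho-\rho^2+\rho^3}{1-\rho+3\rho^2+\rho^3} - \epsilon}.$$

By Lemma 8.1 there are parallel caps $B_1$ and $B_2$ such that

$$Q(A_i \Delta B_i) = \Pr(\bar A_i \Delta \bar B_i) \le C(\rho, \epsilon)\, m^{c(\rho)}\, \delta_*^{\frac14 \frac{1-\rho-\rho^2+\rho^3}{1-\rho+3\rho^2+\rho^3} - \epsilon}.$$ □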
The proof of Lemma 8.1 is quite simple, so we present it first:

Proof of Lemma 8.1. Let $H = \{x \in \mathbb{R}^n : \langle a, x\rangle \le b\}$, and suppose without loss
of generality that $b > 0$. For any $\epsilon > 0$, define

$$\begin{aligned}
H_\epsilon^+ &= \{x \in \mathbb{R}^n : \langle a, x\rangle \le b(1 + \epsilon)\} \\
H_\epsilon^- &= \{x \in \mathbb{R}^n : \langle a, x\rangle \le b(1 - \epsilon)\}.
\end{aligned}$$

Note that $\Pr(H_\epsilon^+ \setminus H_\epsilon^-) \le C\epsilon$.

Now define $B = \{x \in S^{n-1} : \langle x, a\rangle \le b/\sqrt{n}\}$. Then
$\bar B = \{x \in \mathbb{R}^n : \langle x, a\rangle \le b|x|/\sqrt{n}\}$, and so

$$\begin{aligned}
\Pr(\bar B \setminus H_\epsilon^+) &= \Pr((1 + \epsilon)b \le \langle X, a\rangle \le b|X|/\sqrt{n}) \\
&\le \Pr(|X| \ge (1 + \epsilon)\sqrt{n}) \\
&\le Ce^{-c\epsilon^2 n},
\end{aligned}$$

where the last line follows from standard concentration inequalities
(Bernstein's inequalities, for example). Similarly,

$$\Pr(H_\epsilon^- \setminus \bar B) \le \Pr(|X| \le (1 - \epsilon)\sqrt{n}) \le Ce^{-c\epsilon^2 n}.$$

Since $H_\epsilon^- \subset H \subset H_\epsilon^+$ and $\Pr(H_\epsilon^+ \setminus H_\epsilon^-) \le C\epsilon$, it follows that

$$\Pr(H \Delta \bar B) \le C\epsilon + Ce^{-c\epsilon^2 n}.$$

By choosing $\epsilon = Cn^{-1/2}\log n$, we have

$$\Pr(H \Delta \bar B) \le Cn^{-1/2}\log n. \tag{8.1}$$

Now, the lemma claimed that we could ensure $\Pr(\bar B) = \Pr(H)$. Since the
volume of the set $\bar B' := \{x : \langle a, x\rangle \le b'|x|\}$ is continuous and strictly increasing in
$b'$, we may define $b'$ to be the unique real number such that $\Pr(\bar B') = \Pr(H)$.
Now, either $\bar B \subset \bar B'$ or $\bar B' \subset \bar B$; hence $\Pr(\bar B \Delta \bar B') = |\Pr(\bar B) - \Pr(\bar B')|$. On the
other hand, (8.1) implies that

$$|\Pr(\bar B) - \Pr(\bar B')| = |\Pr(\bar B) - \Pr(H)| \le Cn^{-1/2}\log n,$$

and so the triangle inequality leaves us with

$$\Pr(H \Delta \bar B') \le \Pr(H \Delta \bar B) + \Pr(\bar B \Delta \bar B') \le Cn^{-1/2}\log n.$$ □

We defer the proof of Lemma 8.2 until the next section, since this proof
requires an introduction to spherical harmonics.

8.1 Spherical harmonics and Lemma 8.2
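The concentration step in this proof says that the norm of a standard Gaussian vector in $\mathbb{R}^n$ rarely leaves the window $(1 \pm \epsilon)\sqrt{n}$. The following sketch estimates both tail frequencies by simulation; it is only an illustration under our own choice of parameters, and the function name `norm_tail_frequencies` is hypothetical.

```python
import math
import random


def norm_tail_frequencies(n, eps, samples=2000, seed=3):
    """Empirical frequencies of |X| >= (1+eps) sqrt(n) and |X| <= (1-eps) sqrt(n)
    for a standard Gaussian vector X in R^n."""
    rng = random.Random(seed)
    root_n = math.sqrt(n)
    upper = lower = 0
    for _ in range(samples):
        norm = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))
        if norm >= (1.0 + eps) * root_n:
            upper += 1
        if norm <= (1.0 - eps) * root_n:
            lower += 1
    return upper / samples, lower / samples


# In dimension 400 with eps = 0.2, both events lie more than five standard
# deviations of |X|^2 away from its mean n, so both frequencies are tiny.
upper_freq, lower_freq = norm_tail_frequencies(400, 0.2)
```

This matches the qualitative content of the bound $Ce^{-c\epsilon^2 n}$: for fixed $\epsilon$ the tails decay exponentially in the dimension.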



We will try to give an introduction to spherical harmonics which is as brief as
possible, while still containing enough material for us to explain the proof of
Lemma 8.2 adequately. A slightly less brief introduction is contained in [26];
for a full treatment, see [33].

Let $\mathcal{S}_k$ be the linear space consisting of harmonic, homogeneous,
degree-$k$ polynomials. We will think of $\mathcal{S}_k$ as a subspace of $L_2(S^{n-1}, Q)$; then
$\{\mathcal{S}_k : k \ge 0\}$ spans $L_2(S^{n-1}, Q)$. One can easily check that $\mathcal{S}_k$ is invariant under
rotations. Hence it is a representation of $SO(n)$. It turns out, moreover,
that $\mathcal{S}_k$ is an irreducible representation of $SO(n)$; combined with Schur's
lemma, this leads to the following important property:

Lemma 8.3. If $T : L_2(S^{n-1}) \to L_2(S^{n-1})$ commutes with rotations then
$\{\mathcal{S}_k : k \ge 0\}$ are the eigenspaces of $T$.

In particular, we will apply Lemma 8.3 to the operators $T_\rho$ defined by
$(T_\rho f)(X) = \mathbb{E}(f(Y) \mid X)$, where $(X, Y) \sim Q_\rho$. In other words, $(T_\rho f)(x)$ is the
average of $f$ over the set $\{y \in S^{n-1} : \langle x, y\rangle = \rho\}$. Clearly, $T_\rho$ commutes with
rotations; hence Lemma 8.3 implies that $\{\mathcal{S}_k : k \ge 0\}$ are the eigenspaces
of $T_\rho$. In particular, there exist $\{\mu_k(\rho) : k \ge 0\}$ such that $T_\rho f = \mu_k(\rho)f$ for
all $f \in \mathcal{S}_k$. Moreover, to compute $\mu_k(\rho)$, it is enough to compute $T_\rho f$ for
a single $f \in \mathcal{S}_k$. For this task, the Gegenbauer polynomials provide good
candidates: define

$$G_k(t) = \mathbb{E}\left(t + iW_1\sqrt{1 - t^2}\right)^k,$$

where the expectation is over $W = (W_1, \dots, W_{n-1})$ distributed uniformly on
the sphere $S^{n-2}$. Define $f_k(x) = G_k(x_1)$; it turns out that $f_k \in \mathcal{S}_k$; on the
other hand, one can easily check that $f_k(e_1) = 1$, while $(T_\rho f_k)(e_1) = G_k(\rho)$.
From the discussion above, it then follows that

$$\mu_k(\rho) = \mathbb{E}\left(\rho + iW_1\sqrt{1 - \rho^2}\right)^k.$$
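The formula for $\mu_k(\rho)$ can be evaluated explicitly in the smallest interesting case. The sketch below assumes $n = 3$, so that $W$ is uniform on the circle $S^1$ and $W_1 = \cos\vartheta$ with $\vartheta$ uniform on $[0, 2\pi)$; in that case the expectation is Laplace's integral representation of the Legendre polynomial $P_k$. The function name `mu_k_n3` is hypothetical.

```python
import math


def mu_k_n3(k, rho, nodes=256):
    """mu_k(rho) = E (rho + i W_1 sqrt(1 - rho^2))^k for n = 3.

    W_1 = cos(angle) with the angle uniform on the circle. The integrand is
    a trigonometric polynomial of degree k, so an equally spaced average
    with more than k nodes is exact up to rounding.
    """
    s = math.sqrt(1.0 - rho * rho)
    total = 0j
    for j in range(nodes):
        w1 = math.cos(2.0 * math.pi * j / nodes)
        total += (rho + 1j * s * w1) ** k
    return total / nodes


# Legendre polynomials for comparison: P_2(x) = (3x^2 - 1)/2 and
# P_3(x) = (5x^3 - 3x)/2.
value_k2 = mu_k_n3(2, 0.3)
value_k3 = mu_k_n3(3, 0.3)
```

Although each summand is complex, the average is real, and $|\mu_k(\rho)| \le 1$ because $|\rho + iW_1\sqrt{1-\rho^2}| \le 1$.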

With this explicit formula, we can show that $\mu_k(\rho)$ is continuous in $\rho$:

Lemma 8.4. For any $\epsilon > 0$ there exists $C(\epsilon)$ such that if $\rho, \eta \in [-1 + \epsilon, 1 - \epsilon]$
then

$$|\mu_k(\rho) - \mu_k(\eta)| \le C(\epsilon)(|\rho - \eta| + n^{-1/2}).$$

We will leave the proof of Lemma 8.4 to the end. Instead, let us show
how it can be used to prove that $Q_\rho(X \in A_1, Y \in A_2)$ is continuous in $\rho$.

Lemma 8.5. For any $\epsilon > 0$ there exists $C(\epsilon)$ such that if $\rho, \eta \in [-1 + \epsilon, 1 - \epsilon]$
then

$$|Q_\rho(X \in A_1, Y \in A_2) - Q_\eta(X \in A_1, Y \in A_2)| \le C(\epsilon)\,Q^{1/2}(A_1)\,Q^{1/2}(A_2)\,(|\rho - \eta| + n^{-1/2}).$$

Proof. Take $f, g \in L_2(S^{n-1}, Q)$ and write $f = \sum_{k=0}^\infty f_k$ where $f_k \in \mathcal{S}_k$. Then

$$|\mathbb{E}gT_\rho f - \mathbb{E}gT_\eta f| \le \|T_\rho f - T_\eta f\|_2 \|g\|_2$$

(where $\|f\|_2$ denotes $\sqrt{\mathbb{E}f^2}$) and

$$\|T_\rho f - T_\eta f\|_2^2 = \sum_{k=0}^\infty (\mu_k(\rho) - \mu_k(\eta))^2 \|f_k\|_2^2.$$

By Lemma 8.4 we have

$$\|T_\rho f - T_\eta f\|_2 \le C(\epsilon)(|\rho - \eta| + n^{-1/2})\|f\|_2,$$

and therefore

$$|\mathbb{E}gT_\rho f - \mathbb{E}gT_\eta f| \le C(\epsilon)\|f\|_2\|g\|_2(|\rho - \eta| + n^{-1/2}).$$

Note that if $f = 1_{A_1}$ and $g = 1_{A_2}$ then $\mathbb{E}gT_\rho f = Q_\rho(X \in A_1, Y \in A_2)$, while
$\|f\|_2 = Q(A_1)^{1/2}$. □

The proof of Lemma 8.2 is straightforward once we know Lemma 8.5.
As we have already mentioned, normalized Gaussian vectors from $\Pr_\rho$ have
a joint distribution that is similar to $Q_\rho$, except that their inner products
are close to $\rho$ instead of being exactly $\rho$. But Lemma 8.5 implies that a
small difference in $\rho$ doesn't affect the noise sensitivity by much.

Proof of Lemma 8.2. Let $X, Y$ be distributed according to $\Pr_\rho$. Then

$$\Pr\nolimits_\rho(X \in \bar A_1, Y \in \bar A_2) = \Pr\nolimits_\rho\left(\frac{X}{|X|} \in A_1, \frac{Y}{|Y|} \in A_2\right).$$

Note that conditioned on $|X|$, $|Y|$ and $\langle X, Y\rangle$, the variables $X/|X|$, $Y/|Y|$ are
distributed according to $Q_r$, where $r = \langle X, Y\rangle/(|X||Y|)$. Now with
probability $1 - \frac1n$ it holds that

$$|X|^2, |Y|^2 \in n \pm Cn^{1/2}\log n, \qquad \langle X, Y\rangle \in \rho n \pm Cn^{1/2}\log n.$$

On this event, we have

$$\left\langle \frac{X}{|X|}, \frac{Y}{|Y|}\right\rangle \in \rho \pm Cn^{-1/2}\log n.$$

Using Lemma 8.5 we get that

$$\Pr\nolimits_\rho(X \in \bar A_1, Y \in \bar A_2) \le Q_\rho(X \in A_1, Y \in A_2) + C(\epsilon)n^{-1/2}\log n.$$

A similar argument yields a bound in the other direction and concludes the
proof. □

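Our final task is the proof of Lemma 8.4.

Proof of Lemma 8.4. Define $Z_\rho = \rho + iW_1\sqrt{1 - \rho^2}$ so that $\mu_k(\rho) = \mathbb{E}Z_\rho^k$ (recall
that $W = (W_1, \dots, W_{n-1})$ is uniformly distributed on $S^{n-2}$). Note that if
$|W_1| \le \frac12$ (which happens with probability at least $1 - \exp(-cn)$) then

$$|Z_\rho|^2 = \rho^2 + W_1^2(1 - \rho^2) \le \frac{1 + 3\rho^2}{4} \le 1 - \frac{\epsilon}{2} \le \exp(-c\epsilon).$$

Now,

$$\mu_k(\rho) - \mu_k(\eta) = \mathbb{E}(Z_\rho^k - Z_\eta^k) = \mathbb{E}(Z_\rho - Z_\eta)\sum_{j=0}^{k-1} Z_\rho^j Z_\eta^{k-1-j}. \tag{8.2}$$

If $|W_1| \le \frac12$ then $|Z_\rho^j Z_\eta^{k-1-j}| \le \exp(-c\epsilon k)$ and so

$$\left|\sum_{j=0}^{k-1} Z_\rho^j Z_\eta^{k-1-j}\right| \le k\exp(-c\epsilon k) \le C(\epsilon).$$

Applying this to (8.2), we have

$$\begin{aligned}
|\mu_k(\rho) - \mu_k(\eta)| &= \left|\mathbb{E}(Z_\rho^k - Z_\eta^k)1_{\{|W_1| > 1/2\}} + \mathbb{E}1_{\{|W_1| \le 1/2\}}(Z_\rho - Z_\eta)\sum_{j=0}^{k-1} Z_\rho^j Z_\eta^{k-1-j}\right| \\
&\le 2\Pr(|W_1| > 1/2) + C(\epsilon)\,\mathbb{E}|Z_\rho - Z_\eta| \\
&\le \exp(-cn) + C(\epsilon)|\rho - \eta|,
\end{aligned}$$

where $\mathbb{E}|Z_\rho - Z_\eta| \le C(\epsilon)|\rho - \eta|$ because $|\sqrt{1 - \rho^2} - \sqrt{1 - \eta^2}| \le C(\epsilon)|\rho - \eta|$. □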